Abstract
Recent decades have seen a considerable increase in delegation to independent regulatory agencies, which has been justified by reference to the superior performance of these bodies relative to government departments. Yet, the hypothesis that more independent regulators do better work has hardly been tested. We examine the link using a comprehensive measure of the quality of work carried out by competition authorities in 30 Organisation for Economic Co-Operation and Development (OECD) countries, and new data on the design of these organizations. We find that formal independence has a positive and significant effect on quality. Contrary to expectations, though, formal political accountability does not boost regulatory quality, and there is no evidence that it increases the effect of independence by reducing the risk of slacking. The quality of work is also enhanced by increased staffing, more extensive regulatory powers, and spillover effects of a more capable bureaucratic system.
Introduction
In a process beginning in the 1970s, and accelerating through the 1990s, countries across the world delegated policy competences to independent regulatory agencies (IRAs; for example, Gilardi, 2005; Jordana, Levi-Faur, & Fernández i Marín, 2011; Majone, 1997). Such agencies operate under the authority of appointed rather than elected officials, and are insulated from electoral pressure. The creation of IRAs developed unevenly across countries and policy areas, with the United States being the policy innovator, and financial regulators and competition authorities being among the first to be established, followed later by utility and social regulators (Jordana et al., 2011, p. 1346).
Decisions to delegate to IRAs partially resulted from policy diffusion by means of emulation or the social construction of necessary institutional choices (Gilardi, 2005; Jordana et al., 2011). As such, the intellectual argument for their establishment in each country was never fully articulated from first principles. To the extent that agency creation was explicitly justified, the central argument was that IRAs were able to make more efficient and effective policy decisions than politicians and government departments, either because of their superior expertise and skills, or because of certain biases or deficiencies present in decision-making by democratically elected officials (Gilardi, 2002; Majone, 1999).
Studies of IRAs have concentrated on the factors that account for formal independence, often looking at the role of political and market-making functional imperatives (Elgie & McMenamin, 2005; Gilardi, 2002; Majone, 1997). From a normative perspective, though, this focus on initial delegation seems to miss the point. Delegation to IRAs weakens the link between the exercise of state power and voters, and leads to an increase in the power of unelected technocrats. The standard justification of the existence of IRAs rests on their “superior performance [ . . . ] relative to the result that would be likely if elected politicians were to perform the functions themselves” (Thatcher & Stone Sweet, 2002, p. 18). Yet, in an era in which few if any maintain the Saint-Simonian belief that technocratic decision-making is inherently better than democratic decision-making, the reduction of democratic legitimacy requires some evidence that IRAs do in fact exercise authority in such a superior fashion. That is, we need evidence on whether or not granting (more) independence to IRAs results in better quality work.
Such evidence is hard to come by. Quality is an inherently evaluative and synoptic concept. As such, scholars who are hesitant to use evaluative criteria because of the potential of serious disagreement will seek to avoid investigating quality per se. At most, they will look at discrete quantifiable improvements related to stated goals within a sector. Indeed, previous research has assessed how IRAs promote particular desirable outcomes, such as higher network penetration and lower interconnect rates in telecommunications (Edwards & Waverman, 2006; Gutiérrez, 2003; Wallsten, 2001), more financial stability in banking (Jordana & Rosas, 2014), and greater financial leverage in a range of sectors (Bortolotti, Cambini, Rondi, & Spiegel, 2011).
Yet, the use of outcome measures has important drawbacks. Not only are quantifiable measures associated with gaming and neglect of other objectives (Bevan & Hood, 2006), but they are also affected by much more than the regulation involved, which makes identifying the specific contribution of independence challenging. In many instances, IRA creation diffused rapidly, and, in the case of network industries, was concomitant with sectoral liberalization, which makes it extremely difficult to disentangle the effect of IRAs from that of other changes occurring at the same time. In statistical terms, the few postliberalization data points before the IRA is created exert considerable leverage. In practical terms, it is difficult to imagine many liberalized European markets existing without an independent sectoral regulator.
In this study, we seek to provide evidence on the link between political independence and regulatory quality by focusing on one particular area of regulation: competition policy, or “the set of policies and laws which ensure that competition in the marketplace is not restricted in a way that is detrimental to society” (Motta, 2004, p. 30). Competition policy is inherent to the mixed economy, aiming primarily at market correction rather than creation. This allows us to minimize the risk that our findings depend critically on the way in which regulation was used to create new markets after liberalization.
We seek to improve on previous literature in three respects. First, in line with a growing body of literature in economics and political science, we view independence as a matter of degree rather than as a quality which is either present or absent (see, for example, Cukierman, Webb, & Neyapti, 1992; Gilardi, 2002; Hanretty & Koop, 2012; Selin, 2015). We accordingly reframe arguments linking political independence and regulatory quality so that they make sense when independence is conceived in this way.
Second, besides independence, we take into consideration accountability, the second most discussed institutional feature of regulators. Accountability provisions are relevant for all policy actors, but they are particularly important for IRAs. As they combine rule-making, enforcement, and adjudication, IRAs have a special duty to give account for their actions (Majone, 1999; cf. Maggetti, Ingold, & Varone, 2013; Scott, 2000). Accountability also matters for performance. It may not only reduce the likelihood of misuse and abuse of political power but may also enhance the accuracy of information gathering and analysis (Patil, Vieider, & Tetlock, 2014). This is why several authors have emphasized the need for regulators to be accountable as well as independent (e.g., Busuioc, 2009; Majone, 1999; Quintyn & Taylor, 2007). We therefore assess the impact of independence as well as accountability, distinguishing the two concepts conceptually and empirically, and analyzing new data on the formal political independence and accountability of competition authorities.
Third, we use a comprehensive proxy measure of regulatory quality—in our case, the quality of competition policy enforcement—which is produced by an external company (the Global Competition Review or GCR) based on expert opinion and additional data. The measure centers on the enforcement process, capturing the quality of the work done by competition authorities, including their economic and legal analyses and their intermediate output. Consequently, we are able to focus on the contribution of the authorities themselves, rather than on certain market outcomes which are affected by many other factors. The data on regulatory quality allow us to assess the impact on quality of the formal independence and accountability of competition authorities in 30 member states of the Organisation for Economic Co-Operation and Development (OECD), for the period 2005-2014. The results of our ordered probit regression of regulatory quality show that independence indeed improves quality. However, the effect of accountability is not in the hypothesized direction, raising questions of institutional design and the role of accountability more generally.
Independence and Regulatory Quality
Independence is the most commonly discussed characteristic of IRAs. The concept serves two purposes in the literature. First, it demarcates the scope of the genus: A regulator must be independent in some minimal sense to be considered an IRA. For Thatcher (2005), for instance, the minimum requirements for categorization as an IRA comprise “(1) that the agency has its own powers and responsibilities given under public law; (2) that it is organizationally separated from ministries; and (3) that it be neither directly elected nor managed by elected officials” (p. 352; cf. Jordana et al., 2011, p. 1351; Majone, 1999, p. 2). Second, the concept of independence allows us to differentiate within the category of IRAs. That is, independence is a matter of degree, with some regulators being more independent than others.
In this study, we focus on political independence, or the independence of regulators from elected politicians and members of government. In concentrating on political independence, we do not mean to imply that independence from other actors, and from the regulated sector in particular, is unimportant; merely that it is independence from politics that is implied by the literature justifying the creation of IRAs. By the political independence of an agency, we mean “the degree to which the agency takes day-to-day decisions without the interference of politicians in terms of the offering of inducements or threats and/or the consideration of political preferences” (Hanretty & Koop, 2013, p. 196).
Political independence defined this way is a property of agencies’ behavior, or what actually happens. It is common in the literature to make a distinction between formal and actual independence, or between what is true de jure and de facto. This has been the case since the early literature on de jure and de facto central bank independence (Cukierman et al., 1992). We concentrate on formal, or de jure independence, which we conceptualize as the degree to which there are statutory provisions that decrease the possibilities for politicians to influence agency decisions before they are made. We focus on formal independence because this is the feature of IRAs that can most easily be “engineered,” and because there is considerable evidence linking grants of formal independence to higher degrees of actual independence, at least in established democracies (e.g., Ennser-Jedenastik, 2015; Hanretty & Koop, 2013). Nonetheless, our findings do not automatically translate to de facto independence.
How then should we expect formal independence and regulatory quality to be related? Although early studies of independent regulatory commissions in the United States already associated independence with higher levels of professionalism, consistency, and policy continuity (see Bernstein, 1955, Chapter 5), most theorization of the link has been done in recent decades. Two main causal arguments have been put forward: an expertise-based and a credible commitment-based argument.
The argument from expertise maintains that, compared with politicians and government departments, independent agencies have better access to (or can better process) information, and that better information leads to better work. Though bureaucrats generally have better information than politicians, this does not explain why bureaucrats in an independent agency should have better information than bureaucrats in a government department and drawn from a career civil service. To explain this, we have to examine the act of delegation.
Bawn (1995) explains that in setting the level of agency independence, politicians also make decisions about the administrative procedures agencies may employ, and these procedures may limit the information available to the agency, and the ability to assess policy consequences and make decisions based on expertise. Although the examples Bawn gives are drawn from the study of executive agencies in the United States, initial decisions about grants of independence have also affected the ability of IRAs in Europe to draw on expertise. Limitations may come not only through obligations to consult with the more generalist and political leadership of government departments but also through restrictions on the use of outside consultancies or on the hiring of short-term experts.
This claim has been challenged. Gailmard and Patty (2007) argue that the link between independence and expertise is conditional on bureaucratic selection and retention: Only policy-motivated “zealots” protected by long tenure will invest in expertise, whereas other agency staff (“slackers”) will not because of a lack of incentives. That is, bureaucratic expertise is costly to develop and relationship-specific, while the specific relationship may not continue. Though these findings have some relevance for IRAs, we expect the effect of independence to be stronger for these bodies as the relevant (regulatory) expertise is less relationship-specific than is assumed by Gailmard and Patty. 1 Thus, if expertise is an ingredient of regulatory quality, the literature should lead us to expect a positive effect of independence on quality.
A somewhat more recent strand of literature focuses on the role of credible commitment in regulation (e.g., Gilardi, 2002; Levy & Spiller, 1996; Majone, 1996). The origins of the link between independence, credible commitment, and better policy-making lie in the field of monetary policy, where it is common to argue that politicians use monetary policy to boost the economy (i.e., increase employment and growth), that politicians’ actions in this regard are rationally expected and anticipated by price- and wage-setters, and that as a result, discretionary monetary policy leads only to above-target inflation rather than to the desired gains in employment and output (Kydland & Prescott, 1977). To avoid inflationary bias, and to deal with the time inconsistency in monetary policy, politicians may wish to tie their hands and delegate to an independent and more “conservative” central bank (Rogoff, 1985).
Regulation scholars have argued that time inconsistency is important in their field, too. Politicians wish to commit to long-term regulatory policies, but face incentives to deviate and pursue politically attractive short-term policy options (Majone, 1996). For instance, while price caps are regulatory instruments to correct market failure in monopolistic or oligopolistic sectors, they may also be used by politicians for electoral purposes. Yet, anticipating such use of caps, potential investors may stay away or move away from these sectors, leading to markets being hampered rather than corrected (cf. Levy & Spiller, 1996). By delegating to IRAs headed by officials with different time horizons and incentives, politicians can tie their hands and commit credibly to long-term regulatory objectives and better decisions.
Summarizing the arguments, agency independence may sever the link between political preferences and policy decisions (a point we come back to in the next section), but it enhances regulatory quality, whether we follow the logic of expertise or the logic of commitment. Hence, we hypothesize that higher degrees of formal political independence are associated with higher regulatory quality.
Accountability and Regulatory Quality
Political accountability plays a crucial role in normative discussions of IRAs. That is, if regulators are not accountable to anyone, their legitimacy is called into question (Majone, 1999; Scott, 2000). Accountability does not feature prominently in empirical studies though. The main reason is that many scholars equate it with control, which is itself considered to be the inverse of independence. Hence, mechanisms designed to ensure accountability by definition compromise independence, and assessing both independence and accountability becomes nonsensical. 2 Yet, a closer look at the two concepts may lead us to treat them as separate dimensions.
Scholars working on accountability typically define the concept in terms of answerability, or the ability to provide information on, and explanation of, one’s conduct. 3 For Philp (2009), for instance, “A is accountable with respect to M when some individual, body or institution, Y, can require A to inform and explain/justify his or her conduct with respect to M” (p. 32; cf. Scott, 2000, p. 40). Analogous to our treatment of independence, we concentrate on formal accountability to politicians, acknowledging all the while that the actual use of accountability may not correspond to what provisions prescribe, and that other forms of accountability (horizontal and downwards) may also matter. We say that an IRA is formally accountable to politicians to the extent that politicians can require the agency to provide information on, and explanation of, its conduct on the basis of statutory provisions rather than nonlegal forms of compulsion.
Defined in this way, the independence of IRAs is compatible with some degree of accountability. That is, agencies may make their day-to-day decisions independently from politicians and political preferences, but may still be required to provide information on, and explanation of, these decisions ex post facto. Indeed, recent empirical research has shown that agencies can be both independent and accountable—they can “have their cake and eat it, too” (Maggetti et al., 2013). 4 Thus, accountability and independence need to be operationalized carefully and separately, and their effects disentangled.
Systematic empirical research on the impact of accountability on policy-making is rare. As Dubnick and Frederickson (2011) put it, the literature has largely failed to go beyond “promises” of what accountability can achieve. Yet, there are two literatures that explicitly link accountability to performance: the constitutionalist and the social-psychological literature.
For constitutionalist scholars, first of all, the role of accountability—and checks and balances more generally—is one of preventing and detecting abuse and misuse of political power. Constitutionalism is grounded in a considerable distrust of power-holders. As Loewenstein (1957) put it, “A stigma is attached to power, and only the saints among the power holders—rarely found—are able to resist the temptation of abusing it” (p. 8). The solution to the problem is believed to lie in institutional design, which would allow us to constrain and restrain the exercise of political power.
These ideas have found expression in the Federalist Papers and the U.S. Constitution, and they are at the heart of constitutionalist principles such as limited government, checks and balances, separation of powers, and judicial review. They are also reflected in the literature on the role of accountability arrangements. That is, requirements to provide information and justification are considered to help prevent and detect arbitrary exercise and abuse of public authority, including the misuse of public funds, the pursuance of particular rather than general interests, the unequal or unfair treatment of citizens (or companies), the neglect of rights and freedoms, and corruption and patronage (e.g., Bovens, 2007). This is not only important from a normative point of view, but it may also affect organizational performance. If political power-holders such as regulatory agencies may be tempted to misuse and abuse their public authority, if these activities are associated with worse performance, and if accountability mechanisms may provide incentives not to engage in these activities, we should expect higher degrees of accountability to be associated with higher quality of (regulatory) decision-making.
The microfoundations of the link between accountability and performance can be found in the field of social psychology and, in particular, the work by Tetlock (see Patil et al., 2014). What is stressed in this literature is the role of accountability as a generator of implicit or explicit expectations “that one may be called on to justify one’s beliefs, feelings and actions to others” (Lerner & Tetlock, 1999, p. 255). These expectations play an important role in decision-making as they lead actors to be less prone to the fundamental attribution error (Tetlock, 1985), and to process information more carefully (Tetlock, 1983) and accurately (Mero & Motowidlo, 1995).
The link between accountability and the quality of decision-making has mainly been established at the individual level. As our research focuses on organizations, we need to be careful when formulating our expectations: Organizational decision-making cannot be treated as a simple aggregate of individual-level processes. Yet, there is at least some evidence suggesting that the effect of accountability on the use of information operates at the individual as well as the group level (Weldon & Gargano, 1988).
However, studies of accountability do not unequivocally point to a positive effect on performance. Accountability is also associated with excessive costs and red tape. This is particularly true for provisions that are generic in nature, including standardized and routine requirements such as obligations to produce annual reports and evaluation protocols (Bovens & Schillemans, 2014). Indeed, scholars have emphasized how sensitive the design of accountability is, and how important appropriate design is for accountability to have desirable effects on performance (May, 2007; Quintyn & Taylor, 2007). Nonetheless, as accountability provisions are mainly linked to better performance, we hypothesize that higher degrees of formal political accountability are associated with higher regulatory quality.
Accountability may not only have a direct effect on the quality of regulatory decision-making but may also strengthen the effect of independence. For this argument to make sense, we need to have a closer look at the differential effect of independence. As set out in the “Independence and Regulatory Quality” section, political independence can increase the level of expertise in regulatory decision-making, and reduce short-termism in the process. Yet, insulating regulatory agencies from politicians comes at a cost: By granting higher degrees of political independence, one also enhances the possibility of slack and bureaucratic drift (e.g., Bawn, 1995; Gailmard & Patty, 2007; McCubbins, Noll, & Weingast, 1987). That is, regulators may actually do something other than that which they were initially directed to do by their political principals.
Partially, political independence is precisely intended to create the possibility for regulatory agencies to drift. Following the logic of credible commitment, regulators shall stick to long-term policy objectives even if their political principals want them to (temporarily) move away from these objectives. Hence, the potential decline in policy responsiveness to politicians’ preferences may constitute an asset rather than a cost. However, policy unresponsiveness is not the only concern; bureaucratic drift may also take a more procedural form. Political independence removes some checks and balances from the policy process as it reduces the opportunities for politicians to keep an eye on the way in which agencies carry out their work. Reduced oversight makes it, ceteris paribus, easier for regulators to exercise their power arbitrarily and to misuse their funds. It may result in more corner-cutting, fewer procedural checks, more superficial investigations of complaints, and lower levels of due process. In sum, political independence raises concerns of a constitutionalist nature.
Ideally, regulators are designed in such a way that independence enhances levels of expertise and long-termism, while opportunities for misuse and abuse of power are curbed. Such a design may be created by complementing political independence with provisions for accountability (Majone, 1999; Quintyn & Taylor, 2007). If the two concepts are, indeed, “complementary and mutually supporting” (Majone, 1999, p. 14), accountability can serve as a guard against abuse and misuse, as regulators are called upon to explain and justify their actions, or lack thereof (cf. Quintyn & Taylor, 2007, p. 35). Accordingly, we hypothesize that the effect of independence depends on the level of accountability: the positive effect of formal independence on regulatory quality is stronger at higher levels of formal accountability.
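The conditional expectation above can be made concrete with an interaction term. The following is an illustrative sketch only: the symbols $Q^{*}$ (latent regulatory quality), $I$ (formal independence), $A$ (formal accountability), and the coefficients $\beta$ are notation assumed here, not the specification reported by the authors.

```latex
% Illustrative notation (assumed, not taken from the source):
% Q* = latent quality, I = formal independence, A = formal accountability
Q^{*}_{it} = \beta_1 I_{it} + \beta_2 A_{it} + \beta_3 \, (I_{it} \times A_{it}) + \varepsilon_{it}
\qquad\Rightarrow\qquad
\frac{\partial Q^{*}_{it}}{\partial I_{it}} = \beta_1 + \beta_3 A_{it}
```

Under this notation, the moderating expectation corresponds to $\beta_3 > 0$, so that the marginal effect of independence grows with the level of accountability.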
Data and Operationalization
Having hypothesized the relationship between independence, accountability, and the quality of regulatory decision-making, we now turn to the operationalization of these and other variables.
Independence and Accountability
Our measures of formal political independence and accountability are based on the analysis of the statutory provisions governing independent competition authorities in the 30 OECD countries included in our analysis. As we are interested in the variation in the design of IRAs, we excluded from our analysis those competition authorities that were part of a ministerial hierarchy in any given year. 5 Thus, the Antitrust Division of the United States Department of Justice was excluded, as were the Belgian Directorate General for Competition (operating until 2013), Spain’s Competition Service (operating until 2007), and the nonindependent Dutch Competition Authority (operating until 2005).
Over the period in question (2005-2014), we observed considerable within-country (and within-agency) variation in design. Incorporating this variation in the analysis is important as it allows for more controlled comparison than does the cross-country variation. We therefore disaggregated competition authorities, and considered competition authorities which had been reformed, or which had been subjected to alterations in their independence or accountability, as separate bodies. Thus, although we have data for 30 countries, we have information for 46 regulators, or regulator-spells. 6
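The disaggregation into regulator-spells can be sketched as follows. This is a minimal illustration under assumed inputs: the agency names, the `design` labels, and the function `regulator_spells` are hypothetical and do not come from the authors' data or code.

```python
from itertools import groupby

# Hypothetical agency-year records: (agency, year, design label).
# The design label stands in for the full set of coded statutory provisions.
records = [
    ("Agency A", 2005, "design-1"),
    ("Agency A", 2006, "design-1"),
    ("Agency A", 2007, "design-2"),
    ("Agency A", 2008, "design-2"),
    ("Agency B", 2005, "design-1"),
    ("Agency B", 2006, "design-1"),
]


def regulator_spells(rows):
    """Collapse consecutive years with an unchanged design into one spell."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    spells = []
    # groupby only merges adjacent rows, so a design that changes and later
    # reverts yields separate spells, as a reform history would require.
    for (agency, design), group in groupby(ordered, key=lambda r: (r[0], r[2])):
        years = [year for _, year, _ in group]
        spells.append((agency, design, years[0], years[-1]))
    return spells


spells = regulator_spells(records)
```

Here the two hypothetical agencies yield three spells, because Agency A's design changes in 2007. Each spell would then receive its own independence and accountability scores.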
For each authority, we recorded information regarding different items relating to their statutory design. The selection of the formal political independence items was based on previous literature, particularly the work by Gilardi (2002). For each item, we coded the agency’s “response,” or the category that best described the statutory provisions imposed upon the agency. For some items—for example, provisions which are either present or absent, and which are commonly found in statutes governing competition authorities—the coding was relatively straightforward. For other items, though, the coding was more difficult, either because the provision found in the statute was not immediately reconcilable with the list of options or because the relevant provision was a matter of administrative law more generally. The coding was carried out by one of the authors and a research assistant; differences in coding were discussed by the authors and reconciled by mutual agreement.
To assess the scalability of the items, and to turn our ordinal measurements on several items into a single measure, we used a latent trait model based on item response theory (IRT; cf. Hanretty & Koop, 2012; Selin, 2015). IRT models treat the observed item responses as a function of an unobserved latent variable, in our case the trait we refer to as independence. They allow for the estimation of the weights of items, and the scores of item responses, on the latent trait, and for the exclusion of those items for which reasonable weights and scores cannot be estimated. IRT models are similar to confirmatory factor analysis in the sense that both attempt to identify an underlying latent factor. Yet, while factor analysis requires the inclusion of continuous data, IRT models do not have such a requirement (Raju, Laffitte, & Byrne, 2002, p. 520). This makes IRT models more appropriate for our data structure, which includes items that are measured categorically and even dichotomously. Our aggregate measure of independence is the measure that best “explains” the observed pattern of responses, given a set of assumptions.
Formally speaking, 7 suppose we have information for [...] where [...]
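As a rough illustration of how a latent trait model links ordinal item responses to a single underlying trait, the sketch below computes response-category probabilities under a graded response parameterization with a logistic link. The parameterization, the function names, and all numeric values are assumptions made for illustration; the authors' exact model specification is not reproduced here.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(theta, discrimination, thresholds):
    """Probabilities of each ordered response category for one item.

    theta          -- latent trait value (e.g., an agency's independence)
    discrimination -- item weight on the latent trait (assumed positive)
    thresholds     -- increasing cut points separating adjacent categories
    """
    # P(response >= k) for each cut point, padded with 1 and 0 at the ends.
    cumulative = (
        [1.0]
        + [logistic(discrimination * (theta - b)) for b in thresholds]
        + [0.0]
    )
    # Adjacent differences give the probability of each category.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical three-category item evaluated for a low and a high trait value.
low = category_probabilities(theta=-1.5, discrimination=1.2, thresholds=[-1.0, 0.5])
high = category_probabilities(theta=1.5, discrimination=1.2, thresholds=[-1.0, 0.5])
```

In a sketch like this, an item with a discrimination near zero gives nearly the same category probabilities at every trait value, which is the sense in which an item can turn out to be unrelated to the latent trait.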
Table 1. Independence Items.
Table 2. Accountability Items.
Our final measure of independence includes all items originally used by Gilardi (2002), with three exceptions. First, we excluded items that are conceptually linked to the accountability and powers of agencies rather than to independence. These are also items which previous research on IRAs demonstrated to be unrelated to the latent trait (see Hanretty & Koop, 2012). Second, we excluded items which turned out to be unrelated to the trait of formal independence, thus excluding provisions on items such as whether or not board appointments may be renewed. Our findings in this respect are largely in line with those in previous work on a larger sample of IRAs (Hanretty & Koop, 2012). Third, we excluded items which lacked variation in our specific sample of competition authorities. We were left with 18 items in total. These items, the response categories, and the latent trait data are shown in Table 1. 8
We followed the same procedure to create an aggregate measure of formal political accountability. Again, the selection of these items was based on previous literature. Specifically, we followed Koop (2011), except that we excluded items that lacked variation in our sample of competition authorities. The remaining items relating to accountability, and the different responses to these items, are shown in Table 2. We were left with 11 items in total.
The aggregate measures are shown in Table 3, which lists all authorities in alphabetical order. The table also shows the limited association between the formal independence of authorities and their formal accountability toward politicians: the Pearson correlation coefficient between the two variables measured at the agency level is only r = −.16, which is not significant at standard levels (df = 29, p = .39).
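The reported significance level can be checked from the correlation coefficient and the degrees of freedom alone, using the standard relation t = r·sqrt(df / (1 − r²)) and a two-sided test against the t distribution. The sketch below uses only the Python standard library and integrates the t density numerically; the function names are illustrative assumptions.

```python
import math


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_from_r(r, df, steps=2000):
    """Return (t statistic, two-sided p-value) for a Pearson correlation r."""
    t_stat = r * math.sqrt(df / (1.0 - r * r))
    upper = abs(t_stat)
    # Simpson's rule on [0, |t|]; steps must be even.
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_mass = 2.0 * total * h / 3.0  # probability mass within [-|t|, |t|]
    return t_stat, 1.0 - central_mass


# Values reported in the text: r = -.16, df = 29.
t_stat, p_value = two_sided_p_from_r(-0.16, 29)
```

With the reported values, this should give a t statistic of about −0.87 and a two-sided p-value close to the reported .39.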
Table 3. Independence, Accountability, and Ratings of Agencies Over Time.
Regulatory Quality
There is a broad literature on “regulatory quality.” This is due in part to the multidimensionality of the concept. Indeed, “high-quality” regulatory work refers to work that is efficient, proportionate, legitimate, consistent, not unduly prescriptive, and enforceable. Yet, even if we agree on the characteristics, the measurement of quality, or of its constituent characteristics, remains difficult. Researchers have generally accepted overall measures of regulatory quality at the country level, as is seen from the widespread use of World Bank Governance indicators (Kaufmann, Kraay, & Mastruzzi, 2010). It is both possible and desirable to seek out measures that are finer-grained both in terms of their level of application (organizational rather than country level), and in terms of the facets of quality they examine. Though the GCR’s comprehensive proxy measure of the quality of competition policy enforcement has, as we will point out, its weaknesses, it comes closest to what we consider desirable.
Three approaches to quality in regulation can be identified in the literature: those focusing on process, activity, and real-world outcomes (Radaelli, 2004). Process-based approaches conceive of regulatory quality as a characteristic of the work of regulators such that the procedures, investigations, analyses, and intermediate outputs are of good quality, judged according to standards internal to the kind of work. Thus, high-quality work may refer to well-targeted investigations, clear and well-informed legal analyses, and state-of-the-art econometric analyses. Activity-based approaches, by contrast, understand quality as a characteristic of regulatory work such that regulators doing more work (e.g., more cartel investigations, more sectoral inquiries) are of better quality. Finally, outcome-based approaches take quality to mean regulatory work that promotes those outcomes that regulators are meant to promote.
Outcome-based approaches have so far been most common. Yet, as set out in the “Introduction” section, outcomes are affected by many other variables for which it is hard to control. Activity-based approaches also have their problems: In the case of competition authorities, high levels of activity may either imply that the agency is doing a good job—that is, it manages to detect noncompetitive behavior—or that it is failing to do its job well—that is, it has not managed to create competitive markets. Moreover, to the extent that certain activities are included in targets, levels of activity may be affected by gaming, with other, nonmeasured activities being neglected. For those reasons, we rely on a measure which is primarily process-based. We concentrate on high-quality processes, as opposed to high levels of activity or high-quality outcomes (though it would be surprising if high-quality processes persistently failed to deliver good outcomes).
Whereas our measures of independence and accountability are based on collected data, our measure of quality is based on what might be termed found data. The GCR has, since 2000, published an annual report called Rating Enforcement. This rates the work of competition agencies from across the world in the past year. The first edition rated 16 agencies in 13 countries, with the United States being the only non-European country. The 2015 edition, which we use, rates 36 agencies from 34 countries, including IRAs from Asia and Latin America (GCR, 2015). We use these ratings as the basis for our analysis, excluding agencies from non-OECD countries. 9
The ratings published by the GCR are produced by combining the results of an expert survey with quantitative data and qualitative judgments made by the GCR editorial team. The methods used to produce these ratings have changed over time. Since 2005, ratings have been ascribed to individual agencies rather than to competition regimes, as had been the case in earlier years. We therefore use the ratings for the years 2005 to 2014. Experts are invited to give an aggregate star rating to agencies with which they are familiar (thus permitting—and indeed encouraging—experts to judge agencies from different countries), and are then invited to give open-ended responses to questions on the regulator(s).
GCR describes the expert survey as a “user’s survey,” defining a user of a competition authority as anyone with “cause to liaise with a competition agency on a proceeding whether as a private practitioner, an in-house counsel, business executive, consumer representative, or as an economist,” though it notes that the majority of respondents tend to be lawyers and economists (GCR, 2005, p. 2). The list of respondents was constructed using trade publications, including but not limited to The International Who’s Who of Competition Lawyers, The International Who’s Who of Competition Economists, 40 Under 40, Women in Antitrust, and The Who’s Who of Public Affairs. Subscribers to the service were also given the possibility to fill out the survey. For identified experts, the response rate was approximately one in seven (GCR, 2006, p. 2). The eventual ratings depend on the results of this expert survey, and on the opinion of GCR staff, who conduct in-depth interviews and analyses of statistical data on budgets, caseloads, staffing, and the reporting in the GCR itself. These opinions are included to introduce “flexibility,” and to make ratings more comparable across countries (GCR, 2011, p. 11).
Though we do not know exactly what it is about the enforcement of competition policy that discriminates between the different ratings, the GCR does provide insight into this question, and this information leads us to conclude that the notion of quality captured by the GCR is related to a process-based understanding of quality, rather than a real-world outcomes or activity-based approach. First, the ratings aim to capture the performance of the agencies and specifically disavow consideration of outcomes in terms of the success of competition policy or the performance of the economy (GCR, 2008). Second, the ratings primarily incorporate assessments of the overall quality of the process of enforcement rather than the output. Consequently, competition authorities that only take decisions on cases that are brought before them by another authority—tribunal-like bodies such as the Canadian Competition Tribunal and the Finnish Market Court—are excluded. The latter are considered “passive” authorities, while the GCR is interested in the “active” part of enforcement—the monitoring and the investigations. Although the GCR emphasizes that it collects “a great deal of data on agency activity,” including data on merger, cartel and abuse of dominance activity, and policy and advocacy work (GCR, 2005, p. 2), these statistics are only ever read in conjunction with the questions in the surveys which ask users to evaluate, for each agency they are familiar with, its areas of excellence, the area that could improve, the margin of improvement over the past 5 years, markets which the agency does and does not understand, the suitability of the agency head to the agency’s needs, and the agency’s morale.
While we can never be sure about the motivation of users when they rated competition authorities—which is a more general problem of (inter)subjective measures—the fact that they were asked to reflect on these specific issues makes it at least more likely that they rated quality rather than other features which they were (dis)satisfied with.
The GCR also sets out what type of factors contribute to higher ratings (GCR, 2005, 2007). Features that play such a role are (a) vigorously pursuing enforcement in all areas the agency is responsible for, including politically sensitive areas such as domestic cartels, abuse of dominance—particularly by former state-owned companies—and government-sponsored distortions of competition; (b) not overusing informal settlement processes, as “[w]here the law is ‘grey,’ at a certain point agencies have to make an effort to clarify it” (GCR, 2005, p. 4); (c) being a real enforcer rather than investing mainly in advocacy work, market studies, and “trial by media,” with courtroom victories being an important indicator (GCR, 2012, 2014); (d) demonstrating leadership in competition policy by means of research and development work and participation in debates on where the lines between pro- and anticompetitive behavior lie; (e) being a learning organization, using state-of-the-art methodologies and engaging in continuous self-assessment; (f) taking an economic approach rather than a too formalistic one with an emphasis on the per se rule of anticompetitive behavior; and (g) demonstrating leadership in international cooperation. Hence, this is a rich understanding of quality in terms of process which incorporates substantial elements of phronesis or practical judgment.
Because this is “found data,” and provided on a commercial basis, we cannot calculate measures of interexpert reliability. However, GCR states that their star ratings demonstrate convergent validity, in that agencies with higher star ratings were also more likely to be cited by the agencies themselves as “most admired” in a special 2006 survey question (GCR, 2006, p. 2). In addition, the ratings—which are awarded on a scale of 1 to 5, with half-stars permitted (and even quarter-stars between 2005 and 2006)—demonstrate a form of test–retest reliability: The average Spearman correlation between agency ratings in adjacent years is high, at .958. 10
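The test–retest check described above can be sketched as follows. This is a minimal illustration only: the star ratings for five agencies in two adjacent years are invented for the example and are not GCR data, and the function names are hypothetical. Tied ratings receive the mean of their ranks, and the coefficient is computed as the Pearson correlation of the ranks.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical star ratings for the same five agencies in adjacent years.
ratings_year_1 = [3.0, 4.5, 2.5, 5.0, 3.5]
ratings_year_2 = [3.0, 4.5, 3.0, 5.0, 3.5]
rho = spearman(ratings_year_1, ratings_year_2)
print(round(rho, 3))
```

Averaging such coefficients over all pairs of adjacent years would give a summary figure of the kind reported in the text.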
Finally, the measures correlate reasonably well with two relevant outcome measures from the World Economic Forum’s (WEF) Global Competitiveness Report (Schwab & Sala-i-Martín, 2015). The first measure asks survey respondents to rate the “effectiveness of antimonopoly policy” on a 1 to 7 scale; the Spearman correlation between GCR rankings and WEF survey responses is moderate (
Control Variables
Based on theoretical considerations and the findings of previous studies, we include six control variables: size, specialization, experience, agency powers, government effectiveness, and EU membership.
First, we include in our models the log of the number of staff working on competition policy within the agency. This is not equal to the number of staff working within the agency, because some staff may work on noncompetition related tasks, such as consumer affairs or other areas of regulation if the agency is a multisectoral regulator. We include this control because agencies that are in some sense bigger may produce better work because they can afford to specialize in different sectors of the economy or in different types of analysis. We take the log of this number because we expect the effects of specialization to show a decreasing marginal rate of return. Information on this variable comes from successive editions of the GCR. Missing values for a particular agency-year observation were carried forward (backwards) from the last (next) agency-year with nonmissing information.
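The construction of the staffing variable can be sketched as follows. This is a minimal illustration with invented staff counts for a single agency; `None` marks a missing agency-year. The order of the two passes (forward first, then backward for any remaining leading gaps) is an assumption, since the text does not say which takes priority.

```python
import math

def fill_missing(series):
    """Carry the last observed value forward, then back-fill leading gaps."""
    filled = list(series)
    # Forward pass: a gap takes the most recent observed value.
    for i in range(1, len(filled)):
        if filled[i] is None:
            filled[i] = filled[i - 1]
    # Backward pass: gaps before the first observation take the next value.
    for i in range(len(filled) - 2, -1, -1):
        if filled[i] is None:
            filled[i] = filled[i + 1]
    return filled

# Hypothetical competition staff counts for one agency over six years.
staff = [None, 40, None, None, 55, None]
staff_filled = fill_missing(staff)
log_staff = [math.log(s) for s in staff_filled]

# On the log scale, doubling staff adds the same amount at any size,
# which is one way to encode decreasing marginal returns.
step_small = math.log(20) - math.log(10)
step_large = math.log(200) - math.log(100)
```

Here `step_small` and `step_large` are equal, so ten extra staff matter far more to a small agency than to a large one.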
Second, as a more targeted measure of specialization, we include in our models the proportion of staff within the agency who work on competition issues (rather than on, for example, consumer protection or sectoral regulation). We include this control variable because formal modeling suggests that specialization can bolster independence (Dewatripont, Jewitt, & Tirole, 1999). Information on this variable comes from successive editions of the GCR.
Third, we include the log of (one plus) the number of years since the establishment of a competition authority organizationally separate from a government ministry. We include this control because we believe that agencies with more experience perform better—or that agencies which have no previous experience of competition policy to draw upon face teething difficulties in their early years. We use information from Jordana et al. (2011) to identify the year of establishment of the first organizationally separate competition authority.
Fourth, we include a measure of the agency’s powers. We take into consideration what can be seen as the “basic powers” of IRAs, with variation across agencies still present. These powers are (a) the initiation of investigations, (b) the imposition of administrative fines, (c) the introduction of generally binding rules (delegated or secondary legislation), and (d) the establishment of rules of procedure. The measure is similar in spirit to our measures of independence and accountability, the aggregation being based on an analysis of the responses to the four items. The item loadings are presented in Table 4.
Powers Items.
Fifth, we include in our models a measure of government effectiveness in each country. We use this measure as a proxy for bureaucratic capacity or “the ability to accomplish intended actions” (Huber & McCarty, 2004, p. 481). Bureaucratic capacity may be low for a variety of reasons, including the lack of personal capacity of staff, the breakdown and instability of organizational structures, and the presence of incentives for corruption (Huber & McCarty, 2004, p. 481). Importantly, as Huber and McCarty (2004) point out, reductions in capacity not only reduce the general quality of policy-making via straightforward efficiency loss but also diminish the incentives of civil servants to comply with legislation. Both are relevant for our outcome of interest—the quality of regulatory decision-making. Bureaucratic capacity is a feature of the system as a whole, but it is hypothesized to affect all parts of the bureaucracy, including the competition authority. Hence, we may say that competition authorities “inherit” a base level of capacity from the rest of the bureaucratic system. To construct the measure, we use data from the World Bank’s Worldwide Governance Indicators (Kaufmann et al., 2010). Specifically, we use information concerning the government effectiveness component, which captures
perceptions of the quality of public services, the quality of the civil service and the degree of its independence from political pressures, the quality of policy formulation and implementation, and the credibility of the government’s commitment to such policies. (Kaufmann et al., 2010, p. 4)
The World Bank’s measure is based on 15 separate sources. We are confident that we can use a measure of the effectiveness of public services to explain the quality of a particular public service, because none of the sources specifically ask about competition policy, and those sources which do ask about effectiveness in specific areas ask about policy areas which are very different to competition policy, such as education, health care, transportation, and utilities. Missing values for a particular agency-year observation were carried forward (backward) from the last (next) agency-year with nonmissing information.
Sixth, and finally, we include a dummy variable which takes on the value of 1 if the agency operates in an EU member state. We include this variable because EU member states are exposed to a form of horizontal accountability through their relationship with DG Competition. Also, they benefit from exchange of best practice through their membership in the European Competition Network (Maggetti & Gilardi, 2014). We expect this to lead to better performance of competition authorities working in EU member states.
Analysis
In the previous section, we discussed the ratings provided by the GCR. We consider the overall rankings ordinal rather than interval data, such that although four stars is better than three stars, the difference between four and three stars may or may not be the same as the difference between three and two stars. We therefore use an ordinal probit regression model.
In our case, our observations—regulator years—are not independent of one another. Rather, we consider observations as nested within countries, and include random intercepts for each country to model this. Although the observations are ordered, we do not explicitly model the dynamics of these ratings. 11
We do, however, include year fixed effects in some models to allow for the possibility that the GCR became harsher or more lenient in different years. Formally, we posit an underlying latent level of quality,
The latent level of quality is related to the observed rating
The model is identified by setting σ² to 1, and by omitting an intercept (Jackman, 2009), and is estimated using the ordinal package for R.
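The specification described in words above can be written out in the standard form of a random-intercept ordinal probit. This is a sketch under stated assumptions, not the authors' own notation: the symbols for the latent quality of agency i in year t, the covariate vector, the country intercept, the year effect (present only in the models with year fixed effects), and the cutpoints are all introduced here for illustration.

```latex
\begin{align}
y^{*}_{it} &= \mathbf{x}_{it}'\boldsymbol{\beta} + \alpha_{c[i]} + \gamma_{t} + \varepsilon_{it},
  \qquad \varepsilon_{it} \sim N(0, \sigma^{2}), \quad \alpha_{c} \sim N(0, \sigma_{\alpha}^{2}), \\
y_{it} &= k \quad \text{if} \quad \tau_{k-1} < y^{*}_{it} \le \tau_{k},
  \qquad \tau_{0} = -\infty, \; \tau_{K} = +\infty, \\
\Pr(y_{it} \le k) &= \Phi\!\left(\tau_{k} - \mathbf{x}_{it}'\boldsymbol{\beta} - \alpha_{c[i]} - \gamma_{t}\right)
  \quad \text{with } \sigma^{2} = 1 .
\end{align}
```

Fixing the error variance at 1 and dropping the intercept pins down the scale and location of the latent variable, so that the cutpoints and coefficients are identified.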
Table 5 shows the results of six different models. Models 1 and 2 show just the effects of independence and accountability, net of any controls. Models 3 and 4 include our control variables. Finally, Models 5 and 6 include an interaction between independence and accountability. Models 2, 4, and 6 are different from Models 1, 3, and 5 in that they include year fixed effects (not shown). These year fixed effects are included to allow for the possibility that the GCR might have become more or less generous with its ratings over time. The values of the coefficients represent the change in the latent level of quality associated with a one-unit increase in the relevant independent variable.
Regression Model Results.
GCR = Global Competition Review.
*p < .05. **p < .01. ***p < .001.
Before discussing the effects of particular variables, we will look at the fit of the different models. The fit of models with year fixed effects is better, but not significantly so: a likelihood ratio test shows that Model 4 does not fit significantly better than Model 3 (7.16 on 10 df, p = .71), and Model 6 does not fit significantly better than Model 5 (6.57 on 10 df, p = .77). There is therefore no evidence of grade inflation (or deflation) over the years.
The fit of the models with an interaction term is also not significantly better than the fit of models without an interaction term. For the models without year fixed effects, Model 5 does not fit significantly better than Model 3 (2.55 on 1 df, p = .11). Thus, although the coefficient has the expected sign, we cannot confirm Hypothesis 3 (that the effects of independence are greater given higher levels of accountability).
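The p-values attached to these likelihood ratio tests can be checked from the reported statistics and degrees of freedom. The sketch below uses only the Python standard library and closed-form expressions for the chi-square upper tail with integer degrees of freedom; the function name `chi2_sf` is a label chosen here, not part of any package the authors used.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite correction sum.
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * density * total

# Test statistics and degrees of freedom as reported in the text.
comparisons = {
    "Model 4 vs Model 3 (year effects)": (7.16, 10),
    "Model 6 vs Model 5 (year effects)": (6.57, 10),
    "Model 5 vs Model 3 (interaction)": (2.55, 1),
}
p_values = {name: chi2_sf(stat, df) for name, (stat, df) in comparisons.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.2f}")
```

All three p-values are well above conventional thresholds, which matches the conclusion that neither the year effects nor the interaction term improves fit significantly.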
We now turn to the interpretation of the findings. In short, the effects of independence are positive and highly significant. However, contrary to expectations, a more extensive set of accountability provisions is associated with lower rather than higher quality of work, though the effect is not always significant. Concerning other agency features, we find a highly significant effect of size: bigger agencies are better-rated. The same is true of agencies with powers to do more things. Interestingly, we find no effect of experience nor do we find any benefit to specialization, contra Dewatripont et al. (1999). Concerning country characteristics, we find a highly significant effect of general government effectiveness, and a positive effect of EU membership.
Let us now have a closer look at the findings. Hypothesis 1, concerning formal political independence, is confirmed: higher levels of formal political independence are associated with higher quality of work. 12 Although our quantitative analysis does not allow us to explore the causal mechanism empirically, we expect the effect to work via the increase in expertise that is associated with higher levels of independence, via the increase in consistency associated with the variable, or via both intervening variables. Importantly, our findings suggest that there are good reasons to justify political independence by reference to better work.
The findings on political accountability raise more complicated questions. Hypothesis 2, which referred to a positive effect of accountability on quality of work, is not confirmed. If anything, there is an effect in the opposite direction, though it is not significant in all model specifications. 13 That is, increases in the type of accountability that we found in the design of competition authorities are not associated with higher quality of work. The question is whether this is a more general problem of accountability or a problem specific to the provisions we found in the statutes of the agencies. On one hand, as we pointed out in the section “Accountability and Regulatory Quality,” accountability provisions introduce reporting procedures which add to the workload of agencies. Indeed, the literature has linked accountability to red tape and excessive costs. On the other hand, previous studies have associated the negative effects with a specific category of provisions: provisions that are generic in nature, including requirements to produce annual plans and reports (Bovens & Schillemans, 2014). These are exactly the kind of provisions that we found in the design of competition authorities. Moreover, we only look at one type of relationship when it comes to accountability: the accountability of competition authorities to politicians. The negative effect of accountability may be relevant only for provisions for political accountability. Even if such provisions come into effect after regulatory decisions are taken—thus differing from political independence—agencies may anticipate the preferences of their political account-holders, leading to the reintroduction of politics via the back door. This is, for instance, why Majone (1999) advocates accountability to other actors than politicians. It might also account for the absence of an interaction effect of independence and accountability. 
All in all, we need to be careful when it comes to the interpretation of the accountability findings: There are reasons to believe they may be specific to the generic and political accountability that we looked at. 14
The role of EU membership also deserves some more reflection. We included EU membership because previous studies referred to the relevance of authorities’ coordination with the DG Competition and other competition authorities in the context of the European Competition Network. Such coordination is primarily aimed at ensuring that cases are allocated among national competition authorities, and that case-relevant information is being exchanged. This became crucial after the introduction of Regulation 1/2003, which increased the scope of the responsibilities of national competition authorities to also include trade between EU member states, not only trade within their own member state. Yet, while the different actors, including the DG Competition, do not have any formal power over one another, the coordination has resulted in an increase in horizontal accountability, learning from best practices, and even adoption of “soft law” (Maggetti & Gilardi, 2014). Hence, supranational coordination has had an effect on national policy-making (Maggetti & Gilardi, 2014). Although our analysis cannot tell us what the causal mechanism is exactly, it suggests that the coordination has affected not only the type of policies introduced at the national level but also the quality of regulatory decision-making, which has improved.
Because the coefficients in Table 5 represent shifts on a latent scale, and because the latent scale is hardly intuitive, the substantive importance of the coefficients can be put into context by considering their effect on the ratings awarded by the GCR. Here, we consider the effects shown in our preferred model, Model 3. 15
Figure 1 shows the effect of specified changes on the probability of receiving a ranking of four stars or more, when all other variables which might affect rankings are held at their mean. 16 When all variables are set to their mean, the probability of receiving a four-star ranking or better is relatively low, at 20%.

Effect of specified changes on GCR rating.
When we move independence from the mean to the top quintile, the probability of a four-star ranking doubles, to just over 40%. The distribution of first differences is skewed, and in many simulations the effect of an increase in independence is much greater. When we make a comparable change to accountability, by moving it to the top quintile, the probability of a four-star ranking decreases, to 11.6%. The magnitude of this change is smaller, despite the larger coefficient. This is because the changes in independence move the agency toward the steepest part of the curve of the probit function, pushing the agency over the four-star threshold, whereas changes in accountability move the agency away from this threshold, and make a low-probability event less likely. The effects of a change in powers are more similar to the effects of a change in independence.
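The asymmetry described above follows from the shape of the probit curve and can be illustrated numerically. In this sketch the 20% baseline is taken from the text, while the latent-scale shift of 0.5 is a hypothetical value chosen only for illustration and is not an estimate from Table 5.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of four stars or more with all variables at their means (from the text).
baseline_prob = 0.20
# Latent position relative to the four-star threshold implied by that baseline.
baseline_index = std_normal.inv_cdf(baseline_prob)

# Hypothetical shift on the latent scale, applied in both directions.
shift = 0.5
prob_up = std_normal.cdf(baseline_index + shift)
prob_down = std_normal.cdf(baseline_index - shift)

gain = prob_up - baseline_prob    # change when moving toward the threshold
loss = baseline_prob - prob_down  # change when moving away from it
print(f"up: {prob_up:.3f}, down: {prob_down:.3f}")
```

An equal-sized latent shift changes the probability by more when it moves the agency toward the threshold than when it moves it away, so a variable with a larger coefficient can still produce a smaller change in probability.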
Somewhat surprisingly, the effects of these design changes are greater than the effect of adding 100 more staff to the agency. The distribution of staffing levels is very much skewed as a result of some very large authorities, which means that a change of 100 staff is actually smaller than a change of 1 standard deviation (155 staff). Nevertheless, this change is a very considerable one given that fully half of our observations have fewer than 100 staff, and some agencies get by with as few as 17 competition staff (the case of the Austrian Cartel Office in 2004). 17
The effect of EU membership is comparable in magnitude with the effect of an increase in independence. In practice, the (true, unknown) effect of EU membership is likely to be greater if membership “causes” countries to increase the level of independence given to their competition authority.
Conclusion and Discussion
This study has aimed to assess the impact of two much-discussed institutional features—formal political independence and accountability—on the quality of regulatory decision-making in competition authorities. Using the GCR ratings of the performance of competition authorities in all OECD member states, and controlling for factors such as the number of staff, the experience of the agency, the powers of the agency, and government effectiveness, our findings suggest that independence has a positive and significant effect on the quality of regulatory policy-making. The same is not true of accountability, which does not boost quality, and which does not seem to moderate the effects of independence.
By focusing on the effects of institutional design on the quality of regulatory decision-making, this study has moved beyond the existing literature, which has largely concentrated on the determinants of institutional design rather than their effects. We have been able to take this step thanks to the use of GCR competition authority ratings. This proxy measure allows us to capture a wide range of (process-related) elements of the multidimensional concept of performance, and is therefore much less subject to the biases resulting from capturing some elements of performance rather than others (cf. Bevan & Hood, 2006), and to the difficulty of disentangling the effect of organizational features when using outcome-based measures.
Our findings bear out the claim of the early literature on delegation to independent agencies. Institutions which are more independent tend to do better quality work than institutions which are less independent. The effect may be mediated by expertise, policy consistency, or both. The question of whether this improved performance sufficiently justifies removing competition policy from the scope of normal democratic politics is, of course, a normative question. But the presumption held by those in favor of delegation has now received stronger empirical backing.
The findings with respect to accountability are less straightforward to interpret. In our study, we made a distinction between independence and accountability. We did so both on the basis of conceptual arguments, and on the basis of previous literature which had suggested that the designers of IRAs could “have their cake and eat it”—that they could design for high independence and high accountability. Had we found a positive effect of accountability on quality, then regulatory agencies would have been able to live in the best of all possible worlds, equipped with high independence, high accountability, and producing work of high quality. Instead, our findings point to some trade-off between formal political accountability and performance, though the negative effect of accountability is not always significant. At the very least, the accountability provisions imposed on competition authorities do not seem to help the organizations.
As discussed in the “Analysis” section, the question that our study cannot answer is whether this is a more general problem of accountability or a problem of accountability design. Scholars do not unequivocally applaud accountability; they also associate it with red tape and excessive costs, particularly in the case of standardized and routine requirements (Bovens & Schillemans, 2014). Indeed, the accountability arrangements that we found in the statutes of competition authorities fall into the latter category, including standardized provisions on annual budgets, reports, and financial accounts. Moreover, we have solely captured accountability to politicians, while there may be reasons to believe that such accountability reintroduces politics via the back door (cf. Majone, 1999). Our findings may, therefore, support arguments about the sensitivity of accountability design; at least, we have learned that introducing a set of generic requirements for political accountability does not have desirable effects. Future research should pay much more attention to different types of accountability, and potential differential effects. Is there a difference between the effect of generic and more specific accountability provisions? How do differences in sanctions linked to accountability matter? And does the effect of political accountability differ from the effect of accountability to other parties? Addressing these types of questions may lead us to find out whether designing accountability more carefully can make a difference.
To the extent that our findings support arguments linking accountability to red tape and excessive costs, they raise the question of how desirable accountability really is. In our view, accountability serves more purposes than increasing the quality of regulatory decision-making. In particular, political accountability is crucial from a perspective of democratic legitimacy. Regulatory agencies make policies and policy decisions which affect the economy and society as a whole, and they do so with a fair amount of discretion. It is precisely under these conditions that democratic legitimacy becomes relevant. If we want to ensure that regulatory policies and decisions are still one way or another linked to citizens, we may want to introduce accountability provisions even if they come at a cost. Nonetheless, if the effect of accountability on regulatory quality is dependent on the form that it takes, there is every reason to invest in analyzing and improving the design.
We also stressed that our findings apply to de jure rather than de facto independence and accountability. Agencies may have low de jure accountability, in that they are not compelled by legislation to report back on their activities, but may have high de facto accountability, in that they provide such information on their own initiative. It is not clear from our work whether such “voluntary accountability” (Koop, 2014) on the part of the agency would harm performance. Future research into the links between de facto independence, accountability, and quality will require in-depth examination of agencies’ efforts to make themselves accountable, and in particular whether agencies’ accountability obligations are discharged in a formalistic manner, or whether instead they provide an insight into the decision-making process of the agency and, thereby, foster greater predictability and certainty for market operators.
Finally, we started our research with an interest in the quality of decision-making by IRAs. Our analysis raises the question to what extent the findings can “travel” to other types of regulatory agencies. In our view, there is no reason to think that the findings are competition authority-specific. The arguments linking political independence to better performance are relevant for all regulatory agencies, not just for competition authorities. Equally, all arguments related to accountability—in the theoretical section as well as in the discussion of the results—are more generic. There is one key difference, though: While competition authorities regulate (specific aspects of) the economy as a whole, most other agencies regulate specific sectors. This increases the importance of analyzing the independence of the regulator from the regulated sector. While regulators can benefit from interacting with their regulatees—particularly, in terms of information and expertise—they may also be captured by them, leading to a decrease in the quality of decision-making. As such capture is much more likely in sectoral regulation, independence from the sector may need to be taken into account in studies of the quality of decision-making by sectoral regulators.
Footnotes
Acknowledgements
We have greatly benefited from the comments and suggestions offered by seminar participants at the University of Exeter, King’s College London, the Centre for Competition Policy at the University of East Anglia, the 2014 conference of the European Consortium for Political Research (ECPR) Standing Group on Regulatory Governance, and the 2014 conference on “Consequences of Multilevel Governance” held at the Hanse-Wissenschaftskolleg in Delmenhorst. We would also like to thank Antje Kreutzman-Gallasch for help with gathering data on regulators’ statutes, and the Centre for Competition Policy at University of East Anglia for funding the data collection. Finally, we would like to thank the anonymous reviewers, whose comments strengthened the article a great deal.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.