Abstract
Governments often fulfill election pledges to remain in power; yet, it is unclear how pledge fulfillment and breakage actually affect public support for government. This article explores the tendency for governments to be penalized for unfulfilled pledges more than they are rewarded for fulfilled pledges. In two large-scale highly realistic online survey experiments (N = 13,000, 10,000), performed at the beginning and middle of a government’s term in office, respondents are presented with a range of (real) election pledges. We find that broken pledges often are more important to government evaluations than fulfilled pledges, and that pledge fulfillment can produce decreases in support from nonsupporters that more than offset the marginal gains among supporters. Findings provide valuable evidence on asymmetries in political behavior, and a unique account of the “cost of ruling,” the seemingly inevitable tendency for governments to lose support during their time in office.
A central theme in representative democratic thought is that electorates will reward governments that perform well, and punish governments that perform poorly. The classic account of this reward-and-punish dynamic assumes that incumbents’ successes and failures are assessed symmetrically. There is, nevertheless, a large body of work suggesting that governments may be penalized for failure more strongly than they are rewarded for success (e.g., Lau, 1985; Soroka, 2014). Indeed, it often appears as if governments are destined to experience decreasing popularity during their time in office, more or less regardless of their performance. This widely acknowledged tendency is sometimes referred to as the “cost of ruling” (e.g., Nannestad & Paldam, 2002; Paldam & Skott, 1995; Palmer & Whitten, 2002; Stevenson, 2002; Wlezien, 2017).
The resulting aggregate-level dynamic has received a good deal of attention, but thus far, few contributions have directly exposed the individual-level factors contributing to what we refer to here as “asymmetric accountability.” This is the aim of the present article, drawing in particular on previous research on election pledges, that is, promises made during electoral campaigns. The extent to which governments fulfill these pledges is often regarded as an important measure of their democratic performance; but, in this instance, a focus on campaign pledges also offers an opportunity to study a finite set of commitments made by a government, for which that government may be rewarded or penalized by citizens. The central question below, then, is whether broken pledges are more important to individuals’ evaluations of government performance than fulfilled pledges. Narrowly viewed, the results speak to the way in which the fulfillment of campaign promises matters for government evaluations (see Elinder, Jordahl, & Poutvaara, 2015; Thomson, 2011). More broadly viewed, they add to the literature that seeks to understand why governments tend to see only very marginal rewards, if any, for achieving positive (i.e., promised) outcomes during their time in office, and provide a unique perspective on why governments tend to lose support over time.
The focus of our analysis is an online, survey-based experimental manipulation in which 13,000 Swedish respondents receive one of 10 fulfilled or unfulfilled pledges from the currently governing Social Democrats (vs. a null treatment). The experiment allows us to compare effects of fulfilled and broken pledges on government evaluations. Results suggest that respondents’ evaluations of the governing party are negatively affected by information about a broken pledge. Information about a fulfilled pledge does not lead to a marked improvement of evaluations, however. What is more, when citizens disagree with a fulfilled pledge, they punish the governing party accordingly. The result of all this can be a net decrease in government approval, even when a government fulfills its pledge, at least among voters who are not core supporters already. The experiment is replicated, with several adjustments, 16 months later.
We focus on the design and results of the experiments below, and interpret them in light of past work on government accountability, on election pledges, and negativity biases in political behavior. First, however, we introduce and review the related literatures. Doing so makes clear the ubiquity of—and the consequent importance of understanding—the tendency toward “asymmetric accountability.”
Negativity Biases, the “Cost of Ruling,” and Asymmetric Accountability
Our work is motivated by two related literatures: one on “negativity biases” in political behavior and the other on the “cost of ruling.” Both point to the possibility that government assessments will be affected more strongly by failures than successes.
There is a long and growing body of work suggesting an asymmetry between electoral rewards and electoral punishments. Classic contributions (Campbell, Converse, Miller, & Stokes, 1960; Key, 1966) already suggested that punishment was more probable than reward. The theme persists in more recent work on negativity in political behavior and perceptions. This argument is related to a large body of scholarship—across the social sciences—on negativity biases, that is, the human tendency to give greater weight to negative information than to positive information. Psychology research on negativity biases in impression formation (e.g., Feldman, 1966; Fiske, 1980; Skowronski & Carlston, 1989) has been especially influential, and has been fruitfully applied in work on political leaders as well (e.g., Goren, 2007; Klein, 1991; Lau, 1982, 1985). Negativity biases have also been observed in studies on economic voting (e.g., Bloom & Douglas Price, 1975; Claggett, 1986), and in attention and reactions to media coverage (e.g., Soroka, 2006; Trussler & Soroka, 2014). They are also reflected in work on “problem avoidance” or “blame avoidance” in policy making (e.g., Hood, 2010; Weaver, 1986), which suggest that governments and bureaucracies are regularly immobilized by the realization that policy change more frequently comes with political losses than gains. All this work points to the tendency for government evaluations to be more strongly affected by negative information than by positive information.
This dynamic has also been apparent in work on the “cost of ruling,” focused on the observation that incumbent parties tend to lose votes over time, seemingly regardless of their performance. Indeed, Cuzan (2015) includes “the law of shrinking support” in his Five Laws of Politics. 1 Several different accounts for the cost of ruling are relevant to our ideas about voters’ tendency toward asymmetric accountability: examples include the “median-gap” hypothesis, in which parties never quite move to the median voter position, leaving a majority of voters unsatisfied (e.g., Paldam & Skott, 1995; Palmer & Whitten, 2002; Stevenson, 2002); a related “policy misrepresentation” hypothesis, in which the erosion of public support is the product of steadily increasing misrepresentation of the public’s preferences for policy (Wlezien, 2017); the “honeymoon effect,” in which very high expectations give way to disappointment over time; the idea of a “coalition of minorities,” in which separate minority groupings coalesce to become an increasingly large body of critics (e.g., Mueller, 1970; Stimson, 1976); and Nannestad and Paldam’s (2002) “grievance asymmetry,” in which citizens tend to weigh negative outcomes more heavily than positive outcomes. Each of these strands in the literature informs the account that follows, that is, an account in which government evaluation is an asymmetric process in which breakage of election pledges matters more than fulfillment of election pledges.
Research on Election Pledges
Election pledges hold a prominent place in classic accounts of representative democracy—accounts in which political parties make clear pledges before elections, voters use these pledges to make their decision at the ballot box, and voters then either reward governments for living up to their commitments, or “vote the rascals out” if pledges are broken. This is typically referred to as the “mandate model,” or the “responsible party model” (e.g., Budge & Newton, 1997; Dahl, 1991; Downs, 1957; Klingemann, Hofferbert, & Budge, 1994; McDonald & Budge, 2005; Powell, 2000). 2 The expectation that voters punish governing parties for broken pledges while rewarding them for fulfilled pledges incentivizes governments to take their promises seriously.
Election pledges do not just have theoretical importance—recent work shows that they are actively considered by parties, representatives, journalists, and citizens. Parties and representatives often claim or perceive themselves to hold a mandate to carry out their election platforms (Grossback, Peterson, & Stimson, 2005); the fulfillment of election pledges is a common topic in political debates around the world; and, parties have been shown to make a large and, over time, increasing, number of election pledges in campaigns (see, for example, Artés, 2013; Håkansson & Naurin, 2016; Mansergh & Thomson, 2007). Moreover, parties’ election programs and pledges get considerable media attention (Costello & Thomson, 2008; Kostadinova, 2017; Krukones, 1984). Although there are few systematic studies on how specific pledges are decided upon by parties (Däubler, 2012; Harmel, 2018; Scarrow, Webb, & Farrell, 2000), there is a large field on official party programs strongly emphasizing manifestos’ significance for parties and campaigns (Budge & McDonald, 2006; Budge, Robertson, & Hearl, 1987; Klingemann et al., 1994).
There is also a growing body of research showing a correspondence between pledges made in election manifestos and governments’ subsequent policy decisions (for overviews and comparative analyses, see Thomson et al., 2017; Naurin, Royed, & Thomson, 2019). This work suggests that pledges made in election manifestos are not just ploys to get votes; they regularly reflect the actual policy intentions of prospective governing parties. This holds across a variety of political systems, although fulfillment is higher for single-party governments than for coalition governments (a point we return to when describing our research design).
Where citizens are concerned, recent work indicates that voters see campaign pledges as useful when deciding whom to vote for (Born, van Eck, & Johannesson, 2018; Elinder et al., 2015; Johnson & Ryu, 2010), and that voters often are able to distinguish between broken and fulfilled pledges when they are specifically asked to do so (Naurin & Oscarsson, 2017; Thomson, 2011, see Pétry & Duval, 2017). Research in this area has been particularly influential for the current project. One observation in the literature is that citizens believe that most campaign pledges are broken, even as research finds that a majority of campaign pledges are acted upon. We believe that this “pledge puzzle” (Naurin, 2011) is related, at least in part, to the tendency for citizens to attach greater weight to policy failures than to policy successes.
Taken together, even as mandate theories of democracy may be unrealistic in some ways, 3 election pledges matter to everyday politics. Our work begins, then, with the recognition that election pledges are of significance, sometimes limited, but also sometimes central, for political representation and for government accountability. We also note that if governing parties are in fact punished for their “wrongs” (such as not fulfilling their promises to the electorate), and not really rewarded for their “rights” (such as fulfilling their promises), then it makes sense that government parties obtain worse evaluations the longer they have been in office. That being said, we make an additional observation: One citizen’s “bad news” comes not just in the form of supported pledges that are not fulfilled but also in the form of disliked pledges that are in fact fulfilled. Put differently, the notion that a government fulfilling its pledges should be rewarded, and that a government breaking its pledges should be punished, is too simplistic. Whether an individual will view the fulfillment or breakage of a pledge as a positive or negative outcome is conditional on his or her political preferences, which is why we move below to the expected moderating influence of “policy consistency.” 4
The Moderating Influence of Policy Consistency
The impact of pledges, fulfilled and unfulfilled, is likely moderated by some combination of (a) the extent to which a pledge is consistent with an individual’s policy preferences and (b) partisanship (independent of pledge contents).
Regarding the former, the effects of fulfillment/breakage on government evaluations should depend on whether or not the policy is desired (by the individual) to begin with. A broken pledge can be grounds for disappointment over the government’s untrustworthiness, but it can also lead to disappointment stemming from the fact that one’s desired policy is not implemented. In a similar way, a fulfilled pledge should lead to disappointment if the individual does not welcome the policy proposal. In this respect, policy preferences could potentially overrule partisan identities, with individuals rewarding nonpreferred or even disliked parties for implementing an appealing policy; similarly, individuals may inflict stronger punishments on any party breaking a preferred pledge.
Apart from the expectations regarding policy consistency, there are reasons to expect that partisanship matters for how fulfillment and breakage of pledges affect evaluations of governments, irrespective of the specific policy. Most importantly, those who voted for the governing party should have higher evaluations, ceteris paribus, than those who voted otherwise. Copartisans’ evaluations may be less affected by information about pledge fulfillment/breakage as well.
There is of course a considerable body of literature on partisan bias. Perhaps most relevant for the present study is Fiorina’s (1981) work highlighting the retrospective aspect of partisan biases, in which past performances of incumbent parties are observed through a party-colored lens. In short, similar events/performances can be perceived differently by different individuals based on their party identification and/or previous voting behavior. 5 More recent contributions also underline the connection between partisan biases and negativity biases, which are seen as intertwined in the sense that positive information matters most if it concerns the party one identifies with, and negative information matters most if its subject is a party one does not identify with (e.g., Goren, 2002, 2007). 6
To sum up, we expect citizens to evaluate their government asymmetrically, in the sense that they will penalize negative outcomes more than they reward positive outcomes. We note that “negative” outcomes with regard to election pledges can come in different forms, depending on an individual’s partisanship and policy preferences. We thus analyze whether pledge breakage matters more than pledge fulfillment in general, as well as whether breakage and fulfillment of supported versus unsupported pledges matter differently. We capture pledge support both directly and through partisanship; here too, we expect negative outcomes to matter more than positive outcomes.
Put more formally, we have the following expectations:
Research Design
Our data come from two online survey experiments with 13,000 and 10,000 respondents, respectively, from the Swedish Citizen Panel of the Laboratory of Opinion Research (LORE). 7 The experiments were performed during a minority coalition government consisting of the large party the Social Democrats and the smaller Green Party. Note that minority governments are common in Sweden. Between 1944 and 2014, as many as 73% of the governments were minority governments, mostly in the form of Social Democratic single-party minority governments (Lindvall et al., 2017). The most recent coalition governments have been coherent and based on joint election manifestos (with the so-called Alliance for Sweden, see Aylott & Bolin, 2015). Although accountability processes often are expected to be more complicated under minority and coalition governments (see, for example, Anderson, 2000; Powell & Whitten, 1993; Tavits, 2007), the Swedish system has often provided the voters with a fairly obvious party to blame or credit by the end of the election period. Recent work on accountability processes in Sweden supports the perception that voters are fairly enlightened when evaluating government performance (Healy, Persson, & Snowberg, 2017; Persson & Martinsson, 2018). It, nevertheless, is important to take the particular context in which we run our experiments into account; we, thus, revisit the generalizability of our findings in the “Concluding Discussion” section.
All respondents were presented with one pledge out of a selection of pledges. All pledges were taken from the election manifesto of the Social Democrats in the 2014 election, or from publicly announced pledges made by Prime Minister Stefan Löfven prior to the election. The use of real-world pledges increases the external validity of the experiment. These are all actual pledges, which actual Swedish voters are responding to; we, thus, expect that the reactions identified below are very close to what we might expect in reaction to news about the Löfven Government’s fulfilled or unfulfilled pledges. This external validity comes at a cost where internal validity is concerned, however. Most important, we are not able to hold all other aspects of the pledges constant while varying fulfillment. Pledges will vary in terms of not only subject matter but also political importance, financial cost, public salience, and so forth. There is also a risk of “pretreatment” (Druckman & Leeper, 2012) if some of the pledges were to be disproportionately covered in the media prior to the survey. We minimize the impact of these other variables by using five different fulfilled pledges and five different unfulfilled pledges, and by comparing the set of fulfilled treatments with the set of unfulfilled treatments. Indeed, we achieve a good degree of balance across sets by selecting one fulfilled and one unfulfilled pledge from each of five policy domains: tax reductions, labor market, infrastructure and transportation, macroeconomy, and education. Constraints on our choice of subject area and pledges were posed by the actual pledges that had been made by the Social Democrats, and by the fact that the pledges had to be either clearly fulfilled or clearly broken (even as the current government had been in power for just less than a year). More information on the pledges used in both this and the following experiment is available in the supplemental appendix.
We, thus, have good reason to believe that our results are not driven by a single topic, or by other pledge-specific quantities that we cannot easily control in our highly realistic (indeed, absolutely realistic) treatments. We also expect pretreatment effects are minimized through the use of multiple pledges, of varying salience.
It is of course worth noting that we study a small selection of all pledges made in the 2014 election campaign, especially when taking into account all possible pledges as perceived by the voters. We argue that these pledges are important and obvious, in the sense that they are included in the official election manifesto or clearly stated by the head of the party. Furthermore, they fit the common scholarly definition of election pledges (Thomson et al., 2017); most importantly, they avoid broad valence issues on which most people agree (albeit to varying degrees).
The treatments themselves were as follows (the original Swedish text can be found in Supplemental Appendix 2): “Before the 2014 election, the Social Democrats promised to . . . Are you aware that the Social Democrats will break/have fulfilled this promise?” 8 The given alternatives were as follows: (a) “Yes, I am aware that the Social Democrats will break/have fulfilled this pledge” or (b) “No, I was not aware that the Social Democrats will break/have fulfilled this pledge.” The following pledges were used in the first experiment:
“Reduce the tax reduction for household services” (the “RUT-reduction,” fulfilled), “keep the tax reduction for rebuilding private housing unchanged” (the “ROT-reduction,” unfulfilled). 9,10
“Improve unemployment benefits” (fulfilled), “forbid companies to hire temporary workers for permanent needs” (unfulfilled).
“Invest in the Swedish rail network” (fulfilled), “abstain from increasing gasoline excises” (unfulfilled).
“Increase state support of export and innovation” (fulfilled), “balance the public finances so that Sweden reaches the surplus goal” (unfulfilled).
“Increase funding of adult education” (fulfilled), “make high school compulsory until 18 years of age” (unfulfilled).
The second experiment, fielded roughly 16 months later, relies on a similar design, and has the advantage of having taken place much later in the Social Democrats’ tenure in government. In this instance, we use four policy domains, namely, social security, education, family policy, and infrastructure and transportation. Of the eight pledges, four are taken from the first experiment and four are newly selected, based on policy decisions taken since the first experiment. The eight pledges that were used for the second experiment are listed pairwise below; the newly added pledges are marked with an asterisk.
“Improve unemployment benefits” (fulfilled), “improve health care benefits”* (unfulfilled).
“Increase funding for adult education” (fulfilled), “make high school compulsory until 18 years of age” (unfulfilled).
“Reserve a third month of parental leave for each parent”* (fulfilled), “increase child care benefits with 100 Swedish kronor per child”* (unfulfilled).
“Invest in the Swedish rail network” (fulfilled), “nationalize the maintenance of the rail network”* (unfulfilled).
Both experiments included a null treatment group, in which respondents were given information unrelated to pledges and government performance: “Do you know that the Social Democrats have been in government since the election 2014?” Alternatives were as follows: (a) “Yes, I know that the Social Democrats are in government since the election 2014” and (b) “No, I did not know that the Social Democrats have been in government since the election 2014.” This formulation was chosen because it allows us to use the same question format, repeat the name “the Social Democrats,” and make reference to the 2014 election. We, thereby, hold constant all factors other than fulfillment/breakage of pledge. In the first experiment, we assigned treatments based on perfect randomization across 11 groups, where each of the 10 treatment groups has roughly 1,000 respondents, with roughly 3,000 respondents assigned to the null treatment. The second experiment had eight treatment groups with roughly 1,000 respondents each, and approximately 2,000 respondents assigned to the null treatment (balance checks are provided in Supplemental Appendix 1). 11
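The assignment scheme for the first experiment can be illustrated with a minimal sketch. The text says only that randomization was "perfect" and that group sizes were approximate, so the shuffle-based complete randomization, the exact arm sizes, the arm labels, and the seed below are all assumptions made for illustration.

```python
import random
from collections import Counter

def assign_arms(n_respondents, n_pledge_arms, n_null, seed=None):
    """Return one arm label per respondent, in random order.

    Hypothetical helper: fixes the arm sizes first, then shuffles the labels,
    so every respondent is equally likely to land in any given slot.
    """
    n_treated = n_respondents - n_null
    # Spread treated respondents as evenly as possible over the pledge arms.
    labels = [f"pledge_{i % n_pledge_arms + 1}" for i in range(n_treated)]
    labels += ["null"] * n_null
    random.Random(seed).shuffle(labels)
    return labels

# Round numbers from the text: 13,000 respondents, 10 pledge arms, 3,000 in the null arm.
arms = assign_arms(n_respondents=13_000, n_pledge_arms=10, n_null=3_000, seed=2014)
counts = Counter(arms)
```

Calling the same helper with 10,000 respondents, eight pledge arms, and 2,000 null respondents would mirror the approximate layout described for the second experiment.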
We do not focus on the respondents’ answers to the treatment questions; those questions were intended only to deliver the pledge information to the respondents. 12 Instead, we focus on the impact of each treatment on government evaluations, measured using “How well do you think that the Social Democrats have performed in government?” Answers were given on a 7-point scale, where 1 is very bad and 7 is very good. This question was asked on the screen immediately following the aforementioned treatments.
Government evaluations are our dependent variables, then; and moderating variables (to test Hypothesis 2) include two different partisanship-based measures. The first is a binary variable, coded 1 for respondents who voted Social Democrat in the last election. Note that this vote variable is from a previous wave in the panel (one immediately proximate to the 2014 election), and is, thus, entirely unaffected by our experimental treatments. The second captures the partisan consistency of individual pledges. The variable is equal to 1 for respondents who receive a treatment (pledge) that is clearly consistent with their partisanship, based on party support for each individual pledge, and equal to 0 otherwise. Partisan consistency is based on statements of support, or not, in all parties’ election programs from the 2014 election. There are a limited number of cases in which we cannot readily assign either 0 or 1 to respondents because the party they voted for did not take a clear stand either for or against the pledge they received; we exclude this limited number of cases from this analysis. Supplemental Appendix 2 includes the coding for all the pledges and parties on which this variable is based.
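The coding rule for partisan consistency can be sketched as follows. The party names, the pledge identifier, and the stance values are placeholders, not the authors' codings (those are in Supplemental Appendix 2); the sketch only assumes that each party's program is coded as supporting a pledge, opposing it, or taking no clear stand.

```python
from typing import Optional

# Hypothetical stance table: +1 = the party's 2014 program supports the pledge,
# -1 = it opposes the pledge, 0 = no clear stand either way.
STANCE = {
    ("Party A", "pledge_x"): 1,
    ("Party B", "pledge_x"): -1,
    ("Party C", "pledge_x"): 0,
}

def partisan_consistency(voted_party: str, pledge: str) -> Optional[int]:
    """Return 1 if the pledge matches the respondent's party, 0 if it does not.

    Returns None when the party took no clear stand (or is not in the table),
    which corresponds to the cases excluded from the analysis.
    """
    stance = STANCE.get((voted_party, pledge), 0)
    if stance == 0:
        return None
    return 1 if stance > 0 else 0
```

Returning `None` rather than 0 for unclear cases keeps excluded respondents distinguishable from respondents who received a clearly inconsistent pledge.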
In the first experiment, partisan consistency is used as a proxy for the individual’s preference for the pledged policy. Another approach is to measure respondents’ policy support directly, and this was the principal objective of the second experiment. At the start of the survey, distant from the experiment, respondents were asked to indicate their opinion of a battery of five policy proposals. Four of these proposals were unrelated to the replicating experiment, whereas the fifth proposal contained the policy from the pledge the respondent would be treated with later. Respondents’ attitudes were measured as follows: “What is your opinion of the following proposals?” 13 Answers were recorded on a 5-point scale, where 1 was very bad proposal and 5 was very good proposal. We rely on these data to produce a measure of policy consistency, that is, the extent to which the treatment pledge is or is not in line with each respondent’s preference.
Note that we do not directly compare the impact of partisan consistency with the impact of policy consistency. The first experiment lacked the policy measure; and, 16 months later, the partisan consistency measure would have been less appropriate. As we move further away from the election, we expect that (a) parties’ positions on specific policies receive less attention and, thus, (b) it is harder for respondents to connect parties to specific election promises. Moreover, (c) parties change their positions over time, and indeed between our two experiments. For these reasons, we focus on partisan consistency in the first experiment, and policy consistency in the second.
Results
Experiment 1
Estimated treatment effects from the first experiment are displayed in Table 1. Model 1 shows the results of an ordinary least squares (OLS) regression of Social Democratic evaluations (1-7) on either the unfulfilled or fulfilled treatments (in comparison with the null treatment). 14 The unfulfilled treatment leads to an average decrease in Social Democratic evaluations of roughly 0.12 points; the fulfilled treatment, in contrast, has no significant effect on evaluations. Already, there is evidence of asymmetric accountability: The Social Democrats are penalized for not fulfilling a pledge, and receive no increase in evaluations for fulfilling one.
Table 1. Results, Experiment 1.
Cells contain OLS regression coefficients with standard errors in parentheses. OLS = ordinary least squares.
*p < .05. **p < .01. ***p < .001.
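The structure of Model 1 can be sketched on simulated data. Everything numeric below is made up for illustration (arm sizes, rating centers, seed) and is not the survey data. The sketch assumes a specification with an intercept and two treatment dummies, with the null treatment as the reference category, and it omits standard errors.

```python
import random
from statistics import mean

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(1)

def simulate(center, n):
    """Draw n hypothetical ratings on the 1-7 scale, scattered around `center`."""
    return [min(7, max(1, round(rng.gauss(center, 1.5)))) for _ in range(n)]

# Simulated ratings only: centers and arm sizes are assumptions.
ratings = {
    "null": simulate(3.0, 300),
    "unfulfilled": simulate(2.9, 500),
    "fulfilled": simulate(3.0, 500),
}

X, y = [], []
for arm, values in ratings.items():
    for v in values:
        # Columns: intercept, unfulfilled dummy, fulfilled dummy.
        X.append([1.0, float(arm == "unfulfilled"), float(arm == "fulfilled")])
        y.append(float(v))

intercept, b_unfulfilled, b_fulfilled = ols(X, y)
```

Because the treatment dummies are mutually exclusive, the intercept equals the mean rating in the null arm and each coefficient equals the difference between that arm's mean and the null mean. The estimates plotted against the dashed null line in Figure 1 can be read the same way.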
These effects are clearer in Figure 1, which shows estimated Social Democratic evaluations across each treatment based on Model 1. The estimated government evaluation for those who received the null treatment is shown as a gray dashed line; the estimated evaluation for each treatment is shown as black squares (with bars showing margins of error). The figure illustrates that respondents give more weight to pledge breakage than to pledge fulfillment. It also makes clear that fulfilled treatments do not lead to an increase in evaluations; if anything, they lead to a decrease (although the coefficient for the fulfilled treatment narrowly misses statistical significance).

Figure 1. The impact of unfulfilled versus fulfilled promises, Experiment 1.
Are treatment effects moderated by partisanship, or the partisan consistency of pledges? Regarding partisanship, the answer seems to be no: Preliminary models suggest that although there is a direct and sizable (positive) impact of Social Democratic voting on government evaluations, this variable has no moderating impact on the experimental treatments. We, consequently, include only the direct effect of this variable in Models 2 and 3 of Table 1. In Model 2, we see that being a Social Democratic voter is associated with a roughly 2-point increase in evaluations. This is of course as we should expect.
Partisan consistency does moderate the impact of pledge fulfillment, however. These results are shown in Model 3, which, in addition to including the direct effect of Social Democratic voting, includes the direct and moderating effects of partisan consistency. Note that partisan consistency varies only for respondents who are not in the null treatment, so the coefficient for partisan consistency captures its impact for the unfulfilled treatment, and the interaction between partisan consistency and unfulfilled captures the difference in the impact of partisan consistency across the unfulfilled versus fulfilled treatments. (Similarly, given the interaction between unfulfilled and consistent, the direct effect of fulfilled captures the impact of fulfillment for inconsistent pledges.) This concoction of direct and interactive effects is difficult to interpret from the coefficients alone, so the effects are illustrated in Figure 2.

Figure 2. The impact of unfulfilled versus fulfilled promises, moderated by partisan consistency, Experiment 1.
Mean evaluations under the null condition are again shown as a dashed line, and the y axis has the same range as in Figure 1, to facilitate a direct comparison. The impact of pledge fulfillment is shown as hollow squares for respondents presented with a pledge that is consistent with their partisanship, and as filled-in squares for respondents presented with a pledge that is inconsistent with their partisanship. In the partisanship-consistent condition, there is an advantage to pledge fulfillment: evaluations shift upward from roughly 2.9 to 3.1. In the partisanship-inconsistent condition, the impact of fulfillment is negative. Fulfilling a pledge that is partisanship inconsistent results in a downward shift from roughly 2.9 to 2.6. Taken as a whole, the improved evaluations from the partisanship-consistent group are, thus, offset, and quite possibly overwhelmed, by the decreased evaluations from the partisanship-inconsistent group.
Results in Figure 2 help make sense of the overall results in Figure 1. Even among those who prefer a party that supports the pledge, the Social Democrats do not reap major gains from following through; they do suffer from breaking those pledges, however. And those who identify with parties other than the Social Democrats are a much tougher group: they penalize heavily for the fulfillment of a pledge that they do not support. The net impact of successfully fulfilling a promise is a decrease rather than an increase in approval.
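The offsetting argument can be made concrete with a small piece of arithmetic. This sketch takes the approximate values quoted for Figure 2 at face value and assumes that the 2.9 starting points refer to the unfulfilled condition. The share of respondents in the consistent group is not reported in this section, so it is left as a free parameter.

```python
# Approximate evaluations quoted in the text for Figure 2 (Experiment 1).
CONSISTENT = {"unfulfilled": 2.9, "fulfilled": 3.1}
INCONSISTENT = {"unfulfilled": 2.9, "fulfilled": 2.6}

# Shift in evaluations when a pledge is fulfilled rather than broken.
GAIN = CONSISTENT["fulfilled"] - CONSISTENT["unfulfilled"]      # about +0.2
LOSS = INCONSISTENT["fulfilled"] - INCONSISTENT["unfulfilled"]  # about -0.3

def net_shift(share_consistent):
    """Average shift across both groups, weighted by the consistent share."""
    return share_consistent * GAIN + (1 - share_consistent) * LOSS

def break_even_share():
    """Consistent share at which gains and losses cancel exactly."""
    return -LOSS / (GAIN - LOSS)
```

Under these assumptions, fulfillment raises average evaluations only if roughly 60% or more of respondents fall in the partisanship-consistent group; with an even split, the losses among the inconsistent group dominate.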
Experiment 1, thus, indicates asymmetry in all three ways mentioned above: (a) not fulfilling a pledge is more damaging to evaluations than fulfilling a pledge is beneficial; (b) not fulfilling a pledge that is consistent with respondents’ preferences is damaging to evaluations, whereas not fulfilling a pledge that is inconsistent with respondents’ preferences has no discernible effect; and (c) fulfilling an inconsistent pledge is more damaging than fulfilling a consistent pledge is beneficial.
Experiment 2
Results from Experiment 1 are telling, but confidence in the generalizability of the results requires some additional testing. First, a replication 16 months later, roughly 2 years into the election period and using partly different pledges, confirms that the first experiment does not identify something unique about the first year in which a government is elected. Second, an additional experiment offers an opportunity to test a different measure of consistency, namely, policy consistency.
Results in Table 2 are based on exactly the same specification as in Experiment 1, and predicted levels of support based on Model 1 are again plotted, now in Figure 3. Note that these results are nearly identical to what we saw in Experiment 1; indeed, besides slightly higher government evaluations in the null condition (the product of slightly more Social Democratic support in this sample), Figure 3 is barely distinguishable from Figure 1. We take this as evidence of the robustness of our initial findings: In comparison with a no-pledge condition, governments tend to lose support by breaking pledges, and do not gain (and may even lose) support by fulfilling pledges.
Results, Experiment 2.
Cells contain OLS regression coefficients with standard errors in parentheses. OLS = ordinary least squares.
*p < .05. **p < .01. ***p < .001.

The impact of unfulfilled versus fulfilled promises, Experiment 2.
What about the impact of policy consistency? As in Table 1, Table 2 reports models that include the impact of Social Democratic voting, alongside the interactive effects of pledge fulfillment and policy consistency. Model 3 suggests significant effects of both treatments, as well as direct and interactive effects of policy consistency. The impacts of the treatments for policy-consistent (hollow squares) and policy-inconsistent (solid squares) respondents are shown in Figure 4.

The impact of unfulfilled versus fulfilled promises, moderated by policy consistency, Experiment 2.
Results here are both similar to and different from what we observed in Experiment 1. Where similarities are concerned, note first that the impact of a fulfilled versus unfulfilled pledge is moderated by consistency: Again, respondents presented with a pledge they support increase their evaluations with fulfillment, whereas respondents presented with a pledge they do not support decrease their evaluations with fulfillment. In comparison with the null condition, the effect of fulfilling a disliked pledge is markedly more negative than the effect of fulfilling a liked pledge is positive. This too is consistent with the preceding findings and supports the idea that there is an asymmetry in evaluations toward negative outcomes.
The central difference between these and previous results is that there is a considerable gap between the policy-consistent and policy-inconsistent groups under the unfulfilled condition, whereas these groups offered relatively similar evaluations under the unfulfilled condition in Experiment 1. We believe that this is a consequence of capturing policy-specific consistency directly, which reveals more clearly the fact that those who are opposed to the pledge in their treatment offer markedly lower government evaluations, regardless of whether the policy is fulfilled or not. A related result is that the change in evaluations among the policy-inconsistent group, as we move from unfulfilled to fulfilled, is roughly −0.2—opposite in sign but identical in magnitude to the increase in evaluations among the policy-consistent group. This is one situation in which we do not find clear evidence of an asymmetry in the impact of negative versus positive information. That said, the aggregate impact of these evaluations will depend on the balance of policy supporters and nonsupporters. We discuss this further in the next section.
The Role of Supporters and Nonsupporters
Note that regardless of strong asymmetries at the individual level, the degree to which pledge fulfillment helps or hurts government evaluations will depend on the balance between partisans and nonpartisans, or policy supporters and nonsupporters. If there are a great many supporters, then even marginal individual-level benefits of fulfillment among supporters may outweigh the larger individual-level costs among nonsupporters. If supporters and nonsupporters are roughly balanced, then benefits among supporters will be washed out by costs among nonsupporters.
What is the tipping point? We can provide preliminary descriptive estimates of it from our experimental results. To be clear, we regard the following only as illustrative estimates. They are useful in interpreting the preceding results, but we readily acknowledge that strong descriptive inference depends on a highly representative sample. Our intention here is only to offer one way of thinking about the impact of our findings in the aggregate. Figure 5 shows estimated aggregate levels of government approval across a shifting balance of supporters and nonsupporters, based on results from Experiment 1, drawn from Model 3 of Table 1.

Aggregate government evaluations, based on Experiment 1.
The estimation for Figure 5 is relatively straightforward: We take estimated government evaluations for (a) a partisanship/policy-consistent respondent and (b) a partisanship/policy-inconsistent respondent, and then generate aggregate evaluations by assigning different proportions of a hypothetical population to the partisanship/policy-consistent and inconsistent groups. At the far left of the first panel, then, we see what aggregate government evaluations might look like in the (unlikely) instance that a pledge is fulfilled for which there are no supporters. At the far right of that panel, we see what aggregate evaluations would be if everyone supported the fulfilled pledge. The solid line shows government support under the fulfilled condition; the dashed line shows government support under the unfulfilled condition.
Not fulfilling a pledge is damaging to evaluations: it produces aggregate levels of support that are always below the null condition, and more so when the pledge has majority support. What is more striking is the impact of fulfilling a pledge, which is better than not fulfilling it only once public support reaches about 60%, and better than the “no pledge” condition only at 80% public support. We regard this as a strikingly high level of support, in any policy domain. Few governing parties (and few policies) are supported by an overwhelming majority of citizens; and even where a governing party is supported by a majority of its citizens, large individual-level losses among out-partisans may outweigh small individual-level gains among copartisans. Only in very limited circumstances, then, are governments likely to reap a reward, at least where evaluations are concerned, from fulfilling a pledge.
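The aggregation logic behind this kind of simulation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' replication code: the function names are hypothetical, and the input values are the rough figures quoted in the discussion of Experiment 1 (about 3.1 and 2.6 under fulfillment, about 2.9 for both groups under nonfulfillment) rather than the Model 3 estimates used for Figure 5. Treating the unfulfilled condition as flat at 2.9 is a simplifying assumption.

```python
def aggregate_evaluation(share_supporters, eval_consistent, eval_inconsistent):
    """Population-weighted mean evaluation for a given share of supporters."""
    if not 0.0 <= share_supporters <= 1.0:
        raise ValueError("share_supporters must lie in [0, 1]")
    return (share_supporters * eval_consistent
            + (1.0 - share_supporters) * eval_inconsistent)


def break_even_share(eval_consistent, eval_inconsistent, target):
    """Share of supporters at which the aggregate evaluation equals `target`.

    Returns None when both groups give the same evaluation, because the
    aggregate is then flat and no unique crossover exists.
    """
    gap = eval_consistent - eval_inconsistent
    if gap == 0:
        return None
    return (target - eval_inconsistent) / gap


# Approximate values read from the text (assumptions, not model estimates).
FULFILLED = {"consistent": 3.1, "inconsistent": 2.6}
UNFULFILLED = {"consistent": 2.9, "inconsistent": 2.9}

# Aggregate evaluations across a shifting balance of supporters.
for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    fulfilled = aggregate_evaluation(share, **{
        "eval_consistent": FULFILLED["consistent"],
        "eval_inconsistent": FULFILLED["inconsistent"],
    })
    unfulfilled = aggregate_evaluation(share, **{
        "eval_consistent": UNFULFILLED["consistent"],
        "eval_inconsistent": UNFULFILLED["inconsistent"],
    })
    print(f"supporters={share:.2f}  fulfilled={fulfilled:.3f}  "
          f"unfulfilled={unfulfilled:.3f}")

# Share of supporters at which fulfilling matches not fulfilling.
crossover = break_even_share(
    FULFILLED["consistent"], FULFILLED["inconsistent"], target=2.9
)
print(f"fulfilled overtakes unfulfilled at about {crossover:.0%} support")
```

With these rough inputs, the crossover between the fulfilled and unfulfilled lines falls at about 60% support, in line with the figure reported above. The same `break_even_share` call could be pointed at a no-pledge baseline, but that level is not quoted in the text here, so it is left out rather than guessed.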
Figure 6 presents a similar simulation, based on Experiment 2, and estimated using results from Model 3 of Table 2. Pledge support in this instance reflects policy rather than partisan consistency. Measuring directly whether respondents support or do not support a pledge reveals a rather different trend within the unfulfilled condition: Even when a pledge is not fulfilled, evaluations improve when respondents see a policy they approve of. According to this model, fulfillment produces better aggregate evaluations than unfulfillment when policy support exceeds 50%. But again, fulfilled or not, neither pledge condition produces better evaluations than the no-pledge condition until public support for the pledge exceeds 75%.

Aggregate government evaluations, based on Experiment 2.
Concluding Discussion
This article has investigated asymmetric accountability, that is, the tendency for negative outcomes to matter more to individuals’ general evaluation of government performance than positive outcomes. We have done so through two survey experiments on election pledge fulfillment, which take into account, in different ways, both partisanship and policy preferences.
Results confirm asymmetries in the costs and benefits of pledge breakage and pledge fulfillment. We find support for both Hypotheses 1 and 2: In short, the negative effect of pledge breakage is larger than the positive effect of pledge fulfillment. We also find that the fulfillment of unwanted pledges will tend to be more damaging than the fulfillment of wanted pledges is beneficial. This seems especially true in Experiment 1, where individuals react asymmetrically to unfulfillment and fulfillment. This dynamic is less apparent in Experiment 2, but even there it is clear that unfulfillment is, relative to baseline, more costly than fulfillment is beneficial. And, in both cases, the mechanics of aggregation suggest that large majorities of the population must support a policy for government approval to increase. These findings are in line with our expectation that negative information will matter more than positive information for government evaluations.
This study has, thus, revealed a dynamic in government evaluations that has previously been rather opaque. Existing work already has observed the tendency for governments to be more frequently penalized than rewarded, of course. The current study reveals a dynamic in individuals’ assessments of governments that we believe helps account for this broader finding. In so doing, it highlights the fact that increasingly negative evaluations of governments over their time in office might have to do not only with disapproval of policy failure but also with disapproval of the policies that have been successfully adopted. These results are complementary to the “cost of ruling” literature, in that they highlight a mechanism that helps explain why government evaluations become more negative over time. “Median-gap” and “policy misrepresentation” hypotheses (e.g., Nannestad & Paldam, 2002; Wlezien, 2017), in particular, are in line with our findings.
We note specifically that our study highlights the negative effects of fulfillment of unwanted pledges, as this has not received much attention in previous work on election pledge fulfillment. Although making a pledge can be beneficial (in that it leads to increased evaluations among supporters before the election), it can also have costs among nonsupporters after the election, when it is fulfilled. This is not to say that there are no incentives for a government to fulfill a pledge. We show that there is a point at which fulfilling a pledge produces better aggregate evaluations than not fulfilling it. And, obviously, there are other ways in which a government is rewarded when fulfilling pledges: Representatives can care more about policy than evaluations/reelection, for instance. It may also be the case that there are benefits to being a party that fulfills pledges over time, a dynamic that would not be captured in our experiments. We, thus, should expect that once a pledge is made, a government will try to fulfill it, even though our study shows that doing so will not necessarily be beneficial, at least where government support is concerned.
We readily acknowledge that our studies focus on evaluations, and there is a need for further work exploring whether this dynamic carries over to votes. Our experiments were implemented in the beginning and in the midst of an election period. We, thus, see evaluations as more meaningful in this case; we also suspect that the absence of a campaign makes our study easier to interpret, insofar as our treatments are not vying with counterarguments in ongoing campaign communications. But connecting our findings more directly to voting behavior is one area for further work.
So too are replications of our experiments in countries with different party and electoral systems. Given the preceding results, we suspect that asymmetric accountability is a central element of government evaluations well beyond the context examined here. We cannot easily predict from these data whether the dynamics observed here will be stronger or weaker under different sets of representative institutions. That said, past work (especially Powell & Whitten, 1993; see also, for example, Anderson, 2000; Tavits, 2007) suggests that where “clarity of responsibility” is high, holding governments accountable is more straightforward. We accordingly suspect that institutions that increase clarity of responsibility, such as unitary (nonfederal) systems and/or majority single-party governments, will tend also to increase the magnitude of both rewards and penalties for pledge fulfillment. We also expect that the number of parties, the nature of party competition, and the degree to which any one party’s policies overlap with another’s will affect the balance of pro- and antipolicy preferences among the public, and, thus, the likelihood that any one pledge fulfillment will result in net gains or losses for government evaluations.
None of these expectations speaks to the relative impact of negative versus positive outcomes, except insofar as low clarity of responsibility (and/or issue salience) may result in no serious change in evaluations either upward or downward. Where there is accountability, then, we expect asymmetry, regardless of the design of representative democracy institutions. The extent to which this is true is, of course, a subject for future research.
Supplemental Material
Supplemental material, CPS.Naurin.replicationfiles for Asymmetric Accountability: An Experimental Investigation of Biases in Evaluations of Governments’ Election Pledges by Elin Naurin, Stuart Soroka and Niels Markwat in Comparative Political Studies
Supplemental material, Online_Appendices_CPS for Asymmetric Accountability: An Experimental Investigation of Biases in Evaluations of Governments’ Election Pledges by Elin Naurin, Stuart Soroka and Niels Markwat in Comparative Political Studies
Footnotes
Acknowledgements
The authors thank the anonymous reviewers of this piece, as well as all those providing feedback at various conference presentations. They are also grateful to the Laboratory of Opinion Research (LORE), University of Gothenburg. This is one of the projects financed by LORE through their open application process.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article. The study was performed within the realm of the authors’ research positions.