Abstract
Theorists offer many predictions about how Americans will respond to significant cyberattacks, but systematic evaluations of American public opinion on these issues remain rare. We present results from a conjoint experiment and find that the public supports retaliation in kind against cyberattacks but is willing to escalate as the economic damage and human casualties of a hypothetical attack mount. Respondents support harsher retaliation after attacks carried out by terrorist groups or state agencies than after those conducted by individuals or civilian hackers. Finally, the dynamics of the public's judgment regarding responses to domestic and international cyberattacks are broadly similar.
How will states respond to significant cyberattacks? 1 Some theorists contend that escalation should be indexed to severity as part of a strategy of deterrence. Kostyuk et al. (2018) argue that US responses to cyber incidents should depend upon factors such as whether the attack takes place during a stable peace or a period of near-war, the possibility of further escalation and unintentional damage, and most important, the identity and objective of the attacker. Others, like Farrell and Glaser (2017), argue that a focus on an attack's effects without considering normative factors will mislead analysts because “states [may] not view all equally damaging attacks as equal”. For still others, cyber operations should not be viewed in isolation but understood as part of actors’ broader strategic interactions. For this camp, cyber operations appear less useful for coercion (Borghard and Lonergan, 2017) and more useful as a means of signaling (Lonergan and Lonergan, 2022).
These theories involve conflicting assumptions about how audiences will evaluate a cyber provocation and how they will favor responding. They make claims, implicit or explicit, about how factors like the severity of a cyberattack, the identity and motivations of an actor carrying out the attack, and the target of the attack will affect audiences’ assessment of a situation and how to respond. Resolving such theoretical contests requires testing the hypotheses generated by these rival theories. Yet such systematic testing remains rare, even though its purpose need not be to crown a winner; it may instead establish the conditions under which each theory holds. Seemingly irreconcilable distinctions may result from mistaking some parts of a continuum of behavior as representative of a whole (Healey and Jervis, 2020). Empirical testing could thus establish conditions under which some mechanisms but not others predominate, thereby potentially helping reconcile at least some of these camps. Finally, the relationship between evaluations of an attack's severity and preferences for responses should be assessed, not assumed, over a broad range of attack conditions.
We contribute to this debate with a conjoint experiment to explore how American respondents evaluate cyberattacks and potential responses. Our results show that fatalities and economic impacts strongly influence both respondents' assessment of an attack's severity and their support for retaliatory action. The type of target attacked influences how respondents judge an attack's severity but not their support for retaliation. The characteristics of an attacker, particularly whether it is a state or terrorist group, can influence support for retaliation, with harsher measures favored against terrorists and governmental agencies than against individuals or civilian perpetrators. Respondents modestly favor harsher retaliation against attacks from Iran and North Korea than from Russia, China, and Israel, although the country of attackers’ origin does not influence whether an attack is perceived as more or less severe. Finally, the dynamics of evaluation and response regarding international and domestic cyberattacks are similar.
Our research makes three contributions. First, our use of conjoint experimental methodology allows us to test theoretical conjectures in ways that earlier research could not. Specifically, we simultaneously test audiences’ evaluations of attack severity, preferences for retaliation, and the relationship between those factors across a wider range of conditions than previous work. By including both domestic and international perpetrators, our design removes potential obstacles to inference regarding the characteristics of an attack, including barriers caused by xenophobia or other factors related to international attacks but not domestic ones. This aspect of our design provides for greater clarity in our interpretations of causal relationships.
Second, our findings carry theoretical implications. Although in some sense our core finding—more severe attacks lead to greater support for stronger retaliation—could be viewed as unsurprising, it has so far been asserted rather than demonstrated. Furthermore, we demonstrate that there are conditions under which cyberattacks will produce more severe reactions than others. We also show something that is not obvious ex ante: that the factors that make an attack seem severe are not the same as those that prompt support for more severe retaliation (although lives lost and, second, economic impact dominate both relationships). Perhaps most important, our findings suggest that the de-escalatory or accommodative potential of cyber operations may be real but also bounded by the severity of an attack. As long as attacks were not particularly severe, respondents favored responses in kind (such as cyberattacks) or even slightly de-escalatory options (such as cyberespionage). As the severity of an attack rose, support for escalation (such as airstrikes) rose substantially.
Finally, we offer avenues for further empirical and theoretical exploration. Our findings depict a public that is potentially less likely to demand harsher retaliation for attacks against critical infrastructure (holding effects constant) than some have proposed while being more sensitive to the characteristics of the group or state carrying out the attack. Attempts to establish normative sanctions against targeting such infrastructure may thus face an uphill climb relative to those that simply seek to limit cyberattacks. This finding also suggests that what matters most is not the target of a cyberattack but the context in which an attack is carried out.
Why public opinion matters for cyber policy
The stakes of cybersecurity policy involve blood, treasure, and escalation that could further jeopardize both. If public opinion affects factors such as the credibility of deterrent threats or the menu of responses from which officials can choose, then understanding public opinion regarding cyberattacks would be important. In this section, we review why public opinion matters for theorizing foreign policy and national security in general, and how researchers have addressed public opinion regarding cybersecurity in particular.
Why study public opinion about cyber operations
Some may doubt the necessity of understanding how the public thinks about cyber operations, either because of a general belief that the public does not hold much sway in foreign policy or because of a specific belief that cyber operations are too complex or technical for the public to have input. Nonetheless, there are several reasons to believe that public opinion about cyber operations warrants investigation.
First, public opinion creates incentives by which leaders make their threats credible (Kreps and Das, 2017; Tomz, 2007). Prominent work in the cyber literature adheres to the view that audiences beyond officialdom and their anticipated reactions can influence signaling and credibility (Borghard and Lonergan, 2017; Lonergan and Lonergan, 2022).
Second, public opinion may influence foreign policy decisions. A weaker, negative version of this argument holds that public opinion constrains elite actions in cyber operations (Schulzke, 2018). Recent work in cognate research areas suggests a stronger claim that officials are influenced directly by public opinion when considering the use of force (Chu and Recchia, 2022; Lin-Greenberg, 2021; Tomz et al., 2020). Gomez and Whyte (2021: 124) report that US officials have said during wargames that the severity of an initial response to a cyberattack would depend on the reaction of the American public. Such influence may not need to be observed to be real. If policymakers act according to a logic of “anticipated representation”, making the decisions the public would want if the public were informed about policy, then policy will be responsive to public opinion even without overt public participation (Arnold, 1992). Since policymakers and members of Congress are greedy to understand public opinion (Druckman and Jacobs, 2006), arguments that summarily dismiss the importance of that factor may be more academic than realistic.
Third, recent evidence suggests that even if elite opinion alone determines cyber policy—a position we reject—studying mass opinion may offer a valid means of understanding elite decision-making because elites may form opinions through processes that resemble those of the public (Kertzer, 2020).
Finally, greater information about how the public views security and foreign policy issues should be valued regardless of policy implications. As a subject of scientific inquiry, public opinion about topics like cybersecurity is worth studying in its own right as a way of learning how people understand politics (Berinsky and Kinder, 2006). As a normative matter, it is impossible to judge foreign policy in a democracy without understanding whether a democratic state's foreign policy is carried out in at least broad accordance with the values and preferences of the public in whose name it is enacted.
Public opinion and cyber operations
Given these factors, it should be clear that understanding how the public evaluates the severity of a cyberattack and how that severity influences public support for retaliatory options merits further study. How, then, to carry out that task?
The secrecy of many cyber operations presents substantial methodological challenges. Such difficulties are exacerbated by the fact that the most theoretically pressing issues often concern anticipated reactions to categories of cyber events that have not yet taken place. Despite decades of worries about a “Cyber Pearl Harbor” or similar cataclysms, most cyberattacks take tamer forms, such as defacing websites or carrying out distributed denial of service attacks. To be sure, more severe cyberattacks have occurred, such as the use of the Stuxnet worm (Lindsay, 2013). Even if cyberattacks have (probably) not yet killed anyone, they have led to significant economic impacts: Maschmeyer (2021: 81–2) reports estimates that the NotPetya attack reduced Ukraine's GDP by 0.5 percentage points in 2017. Two tasks are therefore essential: understanding how the public will evaluate attacks more severe than those that have occurred, and understanding how the severity of an attack (separate from any particular circumstances of its manifestation) will affect the public.
Some may object that researchers should refrain from studying hypothetical situations. Yet researchers in other contexts have made progress despite similar challenges. 2 Moreover, theorizing often proceeds from the assumption that more severe attacks are not only technically possible but worth considering in detail despite their hypothetical nature. For example, Kostyuk et al. (2018) fill out the higher rungs on their escalation ladder by postulating that cyberattacks causing “[p]ermanent damage to civilian infrastructure such as power and utility grids” could become “a catastrophic attack affecting millions of people” leading to “devastating casualties comparable to a small-scale nuclear strike”. If theorists are considering these possibilities, then testing their hypotheses is an urgent task.
Through the investigation of hypothetical events, researchers can evaluate plausible but potentially misguided assumptions about public opinion and other factors. That should place theory and policy on firmer foundations. Furthermore, whether an attack is hypothetical may be a transient rather than an enduring state of affairs. Changes in political contexts or technical factors could make more destructive cyberattacks more likely in the future even if they have not been manifest yet (see also Borghard and Lonergan, 2017; Healey and Jervis, 2020). If greater understanding offers the promise of better policy, then the benefit of researching these topics before a crisis actualizes hypothetical scenarios is clear.
Recent studies suggest that further investigation of hypothetical events is warranted in the cyber context in particular. Despite the early focus of cyber literature on war at the push of a button and an image of widespread, disruptive cyber conflict, experimental research typically finds little support for escalatory responses among the general public. Respondents are less likely to support escalation against a cyberattack compared with other forms of attack (Kreps and Schneider, 2019). They are also less likely to support the use of cyber operations as a foreign policy tool when informed about the (dangerous) effects of these operations (Shandler et al., 2021b). Cyber operations may offer a “valuable escalatory offramp” because of the difficulty surveys and simulations have faced in stimulating escalatory responses to them (Jensen and Valeriano, 2019). Nonetheless, experiments have also shown that cyber terrorist attacks may increase support for retaliatory military strikes if the cyberattack produces lethal effects (Shandler et al., 2021a). Exposure to cyber terrorism produced a heightened perception of risk and support for forceful government policies like Internet surveillance, government regulation of the Internet and military reprisals (Gross et al., 2017).
As productive as this research area has been, however, it does not directly speak to the questions of how different levels of severity and other factors across the likeliest severe attack scenarios (or at least ones likelier than cyber Pearl Harbors) affect the public's assessment of a situation and how to respond. Much variation remains unexplored, and much of this research has focused on questions that are, although similar, separate from those that animate our research.
Consequently, further investigation is required in three major areas. First, empirical research can establish what elements influence the public's evaluation of the effect of an attack: what makes an attack be regarded as more or less severe, and, related, what levels of severity prompt support for more serious forms of reprisals. Second, research can establish whether the characteristics of a target affect how the public evaluates a cyberattack. Farrell and Glaser, for example, argue that distinctions between civilian and military targets will matter. Such conjectures should be tested to establish whether such an effect is present, the magnitude of such an effect, and its direction. Third, research in cognate areas (such as terrorism) has often taken into account the motivations and identities of attackers to understand how these factors can inform public evaluations of attacks and support for responses. Some cyber researchers, particularly Lonergan and Lonergan, foreground this type of concern, while others tend to accord it less significance. Empirical evaluations should establish whether and how much such factors weigh on the public.
We review and describe the need for further research for each category in turn.
Effects of attacks
One area of investigation concerns the relationship between the effects of an attack and the response of the public. It might seem not only plausible but obvious that more severe cyberattacks should produce support for proportional or even more severe retaliation. Yet there are two reasons to be uncertain about that conclusion.
First, some argue that cyber operations may serve as an offramp from a crisis rather than an escalator (Borghard and Lonergan, 2017; Kreps and Schneider, 2019; Lonergan and Lonergan, 2022; Valeriano et al., 2018). In that case, a severe cyberattack might serve to conclude a crisis rather than lead to its escalation. A key debate concerns why this may be the case: the inefficacy of cyber operations, a public disbelief that cyber weapons are really weapons, or some other factor. For some, like Healey and Jervis (2020), the debate results from misperceptions: different mechanisms may come into play depending on the context, including the severity of an attack and whether it is “brazen or reckless enough to demand a muscular response from the target”. Accordingly, they argue, mechanisms operating during the relatively tame attacks of the past few decades may have tended to produce de-escalation, but more severe scenarios in the future could trigger latent mechanisms that spark a more direct response. Testing the relationships between effects and the public's evaluations of a cyberattack across a broader spectrum of effects will allow for greater clarity about the causal mechanisms at play.
Second, researchers have not yet reached a consensus regarding what the public (or policymakers) will or should regard as a substantial cyberattack. 3 Many expect that an attack that produced even a single fatality would be automatically regarded as significant (Farrell and Glaser, 2017). How much blood would trigger how great a response remains unclear. Similarly, the gradations of severity below the hypothesized bright line of a lethal attack, or among attacks with the same levels of lethality, remain unclear even as they are theorized in detail. Kostyuk et al. (2018) propose a ladder of escalation in cyberspace that carefully distinguishes among many ordinal rungs of escalation using categories such as “catastrophic attacks”, “major damaging attacks”, “minor harassment”, and so on. Despite being clear about the ordinality of the rungs, however, they do not specify their cardinality: how many deaths or how many dollars would place an attack on a given rung. Farrell and Glaser (2017) focus instead on the “saliences” of particular norm violations. They hypothesize fine distinctions, including among the types of targets (military and civilian, infrastructural or private), the types of damage (physical or non-physical), and even whether damage is obvious or obscure, as part of their argument about how normative factors will operate in cyberspace. Although at times more specific about quantities and targets, the relative effect of different saliences remains somewhat unclear and untested.
Such disagreements require empirical adjudication. Prior empirical work sheds little light on these debates. Kreps and Das (2017) experimentally manipulate whether cyberattacks target nuclear reactors or the banking system as well as the severity of those attacks. They find high support for retaliatory airstrikes in a treatment scenario involving hundreds or thousands of American casualties but little variance in their experimental effects of moving from the lower treatment condition to the higher one. This finding, however, may be an artifact of their truncated range of treatments. The minimum economic damage of an attack in their experiment is $3 billion, while the minimum number of deaths in the nuclear scenarios is in the hundreds and the maximum is in the thousands of deaths—an amount that makes this treatment arm equivalent to a literal cyber 9/11. Furthermore, their research design avoids intermediate ranges of cyber operations in which we might expect conflicting or ambiguous predictions about how states might react (Schram, 2021).
A different issue is raised by Kreps and Schneider (2019), who vary the mode of an attack (cyber, nuclear, or conventional) and whether it produced economic damage, significant numbers of deaths, or nuclear radiation. This raises a potential methodological objection as, following Farrell and Glaser (2017), one might expect that choosing to attack a power company or nuclear reactor might provoke different kinds of reactions than attacks on corporate or military targets. Similarly, the treatment for a key severity condition may be overly vague: respondents in the relevant treatment conditions are informed only that “significant casualties” took place, and the terms “death” and “fatalities” are not used in their survey instrument. Given that casualties can mean killed or wounded, it is not clear from the reported results how to evaluate the significance of these findings. This does not invalidate their research design, which was aimed at testing a different theoretical question, but it does mean that designing treatments with greater clarity can provide an advance over the state of the experimental art in the cybersecurity field for our question.
Other studies similarly leave open key questions related to attack severity and preferences for retaliation. Shandler et al. (2021a) adopt what is in many ways a promising approach in their multicountry study by comparing cyber and non-cyber terrorist attacks in lethal and non-lethal versions. Their treatment, however, reduces lethality to a binary (either no deaths or a few), rather than a more graduated continuum. The small numbers of deaths that they test may be realistic compared with Kreps and Das, but they do not match the potential damage that many in the scholarly and, especially, policymaking and public circles have claimed could result from a major attack (Sanger, 2019). Although this treatment design was valid for their purposes, it does mean that researchers still lack answers to the questions regarding a wide intermediate range of fatalities on which theoretical debates hinge.
Theorists, policymakers, and applied researchers need to know the functional form of the relationship between different factors producing evaluations of severity and support for response options. Schematically, we could imagine a world in which the relationship between lethality and support for severe reprisal is a bright line dividing no casualties from casualties; a postage (piecewise) function, in which clearly delineated rungs on the escalation ladder correspond to different casualty thresholds; or a linear, logarithmic, or even parabolic function. Each of these scenarios would point to radically different understandings of how states would interpret and respond to a cyberattack. Given the importance of anticipated and strategic interactions, such anticipations could also shape the use of cyber operations as states will unavoidably base their own actions on assessments about the behavior of their targets. This point is not limited to fatalities. Previous research has tended to neglect variation in economic damage in favor of focusing on target, cross-domain deterrence, and fatalities. Yet audiences may make inferences about an attack's severity and how to respond based upon economic damage. Investigating how variations in economic damage influence respondents’ evaluations is, consequently, an urgent area for research.
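The candidate functional forms named above can be made concrete with a short schematic. The sketch below is purely illustrative: the function names, the casualty thresholds (1, 12, and 100 deaths), the cap of 500 deaths, and the scaling of support to the interval from 0 to 1 are all assumptions introduced here for exposition, not estimates or quantities drawn from any study. The parabolic case is omitted for brevity.

```python
import math

# Hypothetical shapes linking fatalities to support for severe reprisal.
# Every threshold, cap, and scale here is an illustrative assumption.


def bright_line(deaths):
    """Any fatality at all triggers full support; zero deaths triggers none."""
    return 0.0 if deaths == 0 else 1.0


def step(deaths, thresholds=(1, 12, 100)):
    """Support climbs one rung each time an assumed casualty threshold is crossed."""
    return sum(deaths >= t for t in thresholds) / len(thresholds)


def linear(deaths, cap=500):
    """Each additional death adds the same increment of support, up to a cap."""
    return min(deaths, cap) / cap


def logarithmic(deaths, cap=500):
    """Early deaths move support far more than later ones."""
    return math.log1p(min(deaths, cap)) / math.log1p(cap)


FORMS = {
    "bright_line": bright_line,
    "step": step,
    "linear": linear,
    "logarithmic": logarithmic,
}


def compare(death_counts):
    """Return each form's predicted support at the given fatality counts."""
    return {
        name: [round(form(d), 2) for d in death_counts]
        for name, form in FORMS.items()
    }


# Print the predictions side by side for a few assumed fatality counts.
for name, values in compare([0, 1, 36, 300]).items():
    print(f"{name:12s} {values}")
```

Under these assumptions the forms diverge most sharply between zero and one death and across the intermediate range of fatalities, which is the range that a design with graduated casualty levels is positioned to probe.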
Characteristics of targets
The target of a cyberattack might influence how its significance is perceived. Lonergan and Lonergan (2022) state that perceptions of how critical a target is influence the magnitude of the effects of cyber operations. Farrell and Glaser (2017) specifically predict that attacks on civilian and military targets, or attacks on infrastructure relative to other forms of targets, will lead to different forms of retaliation. Following a similar intuition, Kostyuk et al. (2018) split attacks against military infrastructure and civilian infrastructure into different rungs of their escalation ladder.
Despite the theoretical importance of such distinctions, the empirical literature has not yet fully explored differences among categories of targets. Studies have referred to power plants (Kreps and Schneider, 2019), railway networks (Shandler et al., 2021a, 2021b), and missile, electrical, and water systems (Gross et al., 2017), but none of these studies vary the type of target that is under attack within each experiment. Kreps and Das (2017) do vary whether banks or nuclear power plants are attacked, yet their design raises potential objections. In particular, the prominence of the nuclear treatment (and associated effects like radiation sicknesses) may conflate deaths caused by a cyberattack, normative violations against targeting civilian infrastructure, and revulsion related to nuclear and radiological weapons. These situations are probably not fully comparable with the purely economic attack in their control arm.
Given that theoretical arguments strongly and plausibly claim that target type should be a major factor in affecting evaluations of attack severity and retaliation, tests are needed. If there is a bright line between civilian and military targets, for instance, then policymakers and theorists need to understand such distinctions to better parse the signals that could be sent during a confrontation. Indeed, if adversaries seek to use cyber weapons to send signals of intention (Lonergan and Lonergan, 2022), or if the development of cybersecurity scholarship makes this a self-fulfilling prophecy, then understanding how the American public understands such distinctions may be of even greater relevance for an attacker, who should want to be as informed as possible about the signals they could send—intentionally or accidentally—by attacking one target rather than another.
Aggressor attributes
Another category of theoretically relevant factors concerns the attributes of the aggressor. One factor concerns the identity of the state in which the attackers are based. Theorists tend to believe that identity matters. Kostyuk et al. (2018) state unequivocally that escalation ladders “include country-specific perceptions of various actors and their likely motivations”. Yet applied authors differ on whether and how they address this point in their designs. Jensen and Valeriano (2019) present a cyber conflict between “fake states, Green and Purple, in order to remove respondents from contemporary events as much as possible”. By contrast, Kreps and Das (2017) specify Russia as the attacker “[f]or purposes of external validity” and argue that “if we had used a hypothetical country, it is likely that respondents would have associated the action with Russia” (p. 3). Kreps and Schneider (2019) chose not to specify a country to make their findings “generalizable beyond a specific country”. When they asked respondents what country the respondents believed carried out the attack, however, although 43% reported that they did not think of a specific country, 38% reported that they envisioned North Korea. Gross et al. (2017) manipulated whether Israeli respondents were informed that the perpetrator of a cyber terrorist incident was Hamas or the hacktivist collective Anonymous, and found little difference between respondents’ evaluations. Ex ante, such variation suggests that researchers believe that it is difficult to know which approach is best or that they do not believe such variation matters.
It may be that there is no single best approach to take. Recent research suggests that abstract manipulations may be equivalent to particular ones (Brutger et al., 2022). Nevertheless, given the emphasis that scholars have placed on actor-specific characteristics and on perceptions of international rivals (Gomez, 2019; Schulzke, 2018), we believe that testing whether there is variation among an attacker's origin country is appropriate. One particular origin country that studies have overlooked, moreover, is the USA itself. Home to a vibrant Internet sector as well as significant criminal and anti-government activity, it is at least possible that some categories of actors within the USA could carry out a cyber operation against American targets. Such a comparison can also tell us if the public is willing to penalize a domestic attacker more harshly than an international rival and whether similar or distinct factors influence their preferences over response options. This would inform not only theory but also future experimental design.
A related set of factors regarding the attributes of aggressors concerns the motivations and type of attackers. Cybercriminals or cyber terrorists (including lone wolves) may operate from a state's territory without being directed by that state's government. Under normative, effects-based, or signaling logics, it seems more generally plausible that respondents would distinguish between the responses appropriate to attacks undertaken by state, parastatal, quasi non-state, and non-state attackers. (Signaling theories in particular should probably predict greater willingness to countenance more severe reprisals against non-state actors.) Most previous empirical work, however, does not vary the type of perpetrator committing a cyberattack within an experiment. Recent work has emphasized the motivations of attackers (Gomez and Whyte, 2021), but most earlier studies did not vary the motivations and objectives of perpetrators. Real-world cases have hinted at the significance of this factor for some decision-makers. After North Korea hacked Sony Pictures in apparent retaliation for the satirical film The Interview, President Barack Obama apparently weighed seriously how that attack had been motivated by a threat to fundamental American values in determining his response (Sanger, 2019: 146). There is an even greater warrant to believe that such factors could shape responses based on longstanding arguments regarding how similar factors influence the public's evaluations of terrorism (Kydd and Walter, 2006)—or even whether an attack is categorized as terrorism (Huff and Kertzer, 2018).
Finally, cyberattacks do not take place under conditions of perfect information. It may be difficult to attribute an attack to a certain perpetrator (Lindsay, 2015; Rid and Buchanan, 2015) and even more challenging to convince a potentially skeptical public of that attribution (Schulzke, 2018). Given the possibility of doubt in official sources, particularly intelligence agencies (Rovner, 2011), some have suggested that independent experts and academics could verify intelligence agencies’ claims (Egloff, 2019). Empirical studies, however, have mostly left the relationship between attribution and support for retaliation unexamined. Kreps and Das (2017) tested whether higher uncertainty about the identity of the aggressor led the public to be less supportive of stronger retaliatory measures. They did not find any significant effects. As they note, their results could stem from respondents being unable to differentiate their treatment levels of “probably” from “almost certain” in their experiment. Leal and Musgrave (2022) found a complex relationship between attribution and retribution. As they note, their results are too ambiguous to throw out the link between attribution and retribution and suggest that investigation of this point is clearly warranted.
Hypotheses
These theoretical conjectures and empirical gaps lead us to focus our investigation on five broad families of hypotheses.
Given the breadth of theoretical conjectures that have been made and the still-emerging state of the literature on cybersecurity issues, these hypotheses are exploratory. For some cases (e.g. H2 and H3), the null hypotheses are substantively and theoretically interesting in themselves. Finding that respondents do not distinguish between target types, for instance, would call into question the feasibility of Farrell and Glaser’s (2017) arguments regarding building normative prohibitions to the targeting of civilian infrastructure. Similarly, finding that respondents do not distinguish attackers based upon their motivation or type would suggest that cyber and conventional terrorism are evaluated by different logics (Huff and Kertzer, 2018). In other cases, especially H1, our interest is in using the mechanics of hypothesis testing to establish effect sizes—especially the magnitude of the differences between effect sizes for economic and human casualties. Finding that respondents do not react to these conditions would call into question effects-based theories of cyber retaliation. No matter how the public evaluates the severity of cyberattacks, we expect that H4 will be supported. Finding that respondents do not favor harsher retaliation to more significant attacks would mean that the public is taking additional information, like attribution certainty, into consideration.
Methodology
We employ conjoint methodology. This experimental technique allows researchers to simultaneously vary many factors (Hainmueller et al., 2014; Hainmueller and Hopkins, 2015). The approach is increasingly popular in foreign policy analysis (Clary and Siddiqui, 2021; Escribà-Folch et al., 2021; Kohama et al., 2022). Conjoint designs return useful information even when respondents are asked to complete many tasks or tasks with many different features (Bansak et al., 2018, 2019). Given the wealth of previous studies and theorizing regarding the general topics in which we are interested, we can choose which factors to vary in a theoretically and empirically informed way that may not have been possible at an earlier stage of the field's progression.
Experimental design
Our conjoint experiment presented respondents with five pairs of profiles depicting hypothetical cyberattacks. These profiles varied attributes including the economic and human casualties caused by the attack; the type of target being attacked; and the attackers’ origin, motivation, and type. 4 Our dependent variables included assessments of the severity of each attack and respondents’ endorsement of different potential retaliatory measures.
Independent variables
We specified economic damage using four levels: minimal economic damage; a few million dollars; tens of millions of dollars; and billions of dollars. We also used a four-level scale to measure human casualties: no deaths; one death; dozens of deaths; and hundreds of deaths.
We included organizations from five sectors to measure target type: large US corporations; the US banking and financial system; hospitals across the USA; the US power grid; and the US military's computer networks. While the first and the last types provide good representations of private and public sectors, respectively, the others fall between these two poles. Moreover, all of these sectors are frequently mentioned in discussions of cyber operations, making these treatments particularly theoretically relevant.
To measure the attack source, we included six countries. Following previous work about the effects of rivalry on cyber incidents (Valeriano and Maness, 2015; Valeriano et al., 2018), we included four rivals of the USA that are powerful in cyberspace: Russia, China, North Korea, and Iran. We also included a country that is both enormously powerful in cyberspace and not a US adversary: Israel. To test if the public evaluates domestic and foreign cyberattacks differently, we also included the USA.
We specify attackers’ types and motivations. Attackers could be one of four types: an individual; civilian hackers; a terrorist group; or a government agency. Their motivations were specified as lacking any clear motivation; trying to acquire sensitive information; seeking to retaliate against specific US policies; or seeking to disrupt American society.
We also included three dimensions to explore whether public attribution and endorsements affected our dependent variables. We specified the intelligence community's confidence in its attribution using three levels: somewhat certain; highly confident; and unanimously confident. To assess independent experts’ endorsement, we included three values: mostly reject it; have mixed views about it; and mostly agree with it. We also varied elite endorsements by the source (either President Trump or then-US Secretary of Defense Patrick Shanahan) and content (disputed the attribution, refused to comment, or endorsed it). 5
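The attribute lists above can be turned into randomized profile pairs along the following lines. This is only a minimal sketch: the dictionary keys, the function names, and the use of independent uniform draws are assumptions made for illustration, and the sketch does not reproduce any restrictions on implausible attribute combinations that the actual design may have imposed.

```python
import random

# Attribute levels as listed in the text. The key names are hypothetical labels.
ATTRIBUTES = {
    "economic_damage": ["minimal economic damage", "a few million dollars",
                        "tens of millions of dollars", "billions of dollars"],
    "deaths": ["no deaths", "one death", "dozens of deaths", "hundreds of deaths"],
    "target": ["large US corporations", "the US banking and financial system",
               "hospitals across the USA", "the US power grid",
               "the US military's computer networks"],
    "origin": ["Russia", "China", "North Korea", "Iran", "Israel", "USA"],
    "attacker_type": ["an individual", "civilian hackers",
                      "a terrorist group", "a government agency"],
    "motivation": ["no clear motivation", "acquire sensitive information",
                   "retaliate against specific US policies",
                   "disrupt American society"],
    "ic_confidence": ["somewhat certain", "highly confident",
                      "unanimously confident"],
    "expert_endorsement": ["mostly reject it", "have mixed views about it",
                           "mostly agree with it"],
    "elite_source": ["President Trump", "Secretary of Defense Patrick Shanahan"],
    "elite_content": ["disputed the attribution", "refused to comment",
                      "endorsed it"],
}


def draw_profile(rng):
    """Draw one profile, picking each attribute level uniformly at random (assumed)."""
    return {name: rng.choice(levels) for name, levels in ATTRIBUTES.items()}


def draw_tasks(rng, n_tasks=5):
    """Return n_tasks pairs of profiles, one pair per rating task."""
    return [(draw_profile(rng), draw_profile(rng)) for _ in range(n_tasks)]


if __name__ == "__main__":
    tasks = draw_tasks(random.Random(2019))
    print(tasks[0][0]["target"], "|", tasks[0][1]["target"])
```

Seeding the generator keeps a respondent's draws reproducible, which helps when matching recorded ratings back to the profiles that were shown.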
Dependent variables
We measured dependent variables including the severity of the attack and respondents’ favored retaliation options. Respondents rated each profile separately. We chose this paired-rating approach because similar paired-rating conjoint experiments have been shown to recover real-world choices even better than the more common forced-choice variants (Hainmueller et al., 2015).
To measure severity, respondents rated the severity of the attack represented in each profile separately from 0 (“Not severe”) to 100 (“Very severe”). They were then asked to indicate whether they supported six possible retaliatory actions. Table 1 shows the options for the international and domestic variants. The answers were binary: “Would support” or “Would not support”. Other experiments and public opinion polls, like the Chicago Council surveys, have used similar lists of possible response options (Kreps and Schneider, 2019; Shandler et al., 2021a). To compare support across international and domestic contexts, we constructed a six-point ordinal scale measuring the most severe retaliatory option supported, with 0 meaning supporting doing nothing and 5 meaning supporting the most severe response (airstrikes or the death penalty).
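The construction of the ordinal scale can be illustrated with a short sketch. It assumes that each non-null response option carries a rank from 1 (mildest) to 5 (harshest) and that a respondent's answers arrive as a mapping from rank to a yes/no value; the function name and data layout are hypothetical.

```python
def harshest_supported(support):
    """Return the rank of the harshest option a respondent supports.

    `support` maps an option's rank (1 = mildest, 5 = harshest) to True
    ("Would support") or False ("Would not support"). A respondent who
    supports none of the ranked options scores 0.
    """
    for rank in support:
        if rank not in range(1, 6):
            raise ValueError(f"unexpected option rank: {rank}")
    supported = [rank for rank, yes in support.items() if yes]
    return max(supported, default=0)


if __name__ == "__main__":
    # Supports options ranked 1 and 3 but nothing harsher, so the score is 3.
    answers = {1: True, 2: False, 3: True, 4: False, 5: False}
    print(harshest_supported(answers))
```

Taking the maximum means that a respondent who skips a milder option but endorses a harsher one is still scored by the harsher one, which matches the description of the scale as the most severe option supported.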
Retaliation categories for domestic and international attacks.
Designing a comparable scale for international and domestic responses proved a challenge. We attempted to make sure that these were proportional while also fitting the range of tools available to policymakers in the foreign and domestic realms. 6 To minimize question-order response effects, we randomized the order of these potential responses.
Participant variables
We also measured information regarding respondents’ political affiliation, gender, racial identification, education (operationalized as having a college degree or higher), and age (defined as the year of the survey minus reported birth year). Our panel structure allows us to measure these without undue concern that such measurement might prime respondents to answer in particular ways, as Berinsky et al. (2012) recommend.
We also collected information regarding respondents’ attitudes toward cybersecurity and individual countries. We measured warmth toward countries involved in the experiment (as well as several distraction countries) on a scale from 0 (“very cold”) to 100 (“very warm”), with 50 as “neither cold nor warm”. We repeated a similar exercise using feeling thermometers to measure how much respondents viewed different countries or organizations as posing a threat to the USA in cyberspace (0 = “Not a threat”, 100 = “Major threat”). We asked respondents to rank several threats to the USA as “critical”, “important but not critical”, and “not important”, including cyberattacks on US computer networks. Respondents also indicated whether they believed that attacks on computer systems “will personally impact you” (four-point scale from “very worried” to “not at all worried”) and which organization they thought should be most responsible for providing cybersecurity (the federal government, ISPs like AT&T and Comcast, major corporations, or individuals and private businesses).
Experiment
We fielded a conjoint experiment on CloudResearch (formerly TurkPrime), a service that recruits and screens workers on Amazon's Mechanical Turk (“MTurk”) platform, between 24 March and 20 April 2019. Recent work validates MTurk as a source of quality research that can, at far lower cost, even recover similar effects as nationally representative samples (Coppock, 2019). Research specifically testing CloudResearch candidates, such as we use here, finds that it offers even higher data quality, comparable with Prolific (Eyal et al., 2021).
We collected responses in a panel with two waves. Wave 1, for which respondents received $0.75, measured demographics and baseline attitudes from 2031 respondents. Respondents completed this survey in an average of 316 s (5.27 min) and a median time of 260 s (4.33 min). These participants were recontacted for wave 2, which contained the conjoint experiment and for which respondents received $1.00. For this wave, we successfully recontacted 1233 respondents who passed data integrity checks (a recontact rate of 61%). These respondents completed ratings of 10 profiles each, for a total N of 12,330. Respondents completed this wave in an average of 784 s (13.07 min) and a median of 693 s (11.55 min).
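The sample arithmetic in this paragraph can be checked directly. The sketch below uses only the figures reported in the text; the variable and function names are illustrative.

```python
WAVE1_N = 2031                 # respondents in wave 1
WAVE2_N = 1233                 # recontacted respondents passing integrity checks
PROFILES_PER_RESPONDENT = 10   # five tasks, each with a pair of profiles


def seconds_to_minutes(seconds):
    """Convert a duration in seconds to minutes, rounded to two decimals."""
    return round(seconds / 60, 2)


recontact_rate = WAVE2_N / WAVE1_N
total_ratings = WAVE2_N * PROFILES_PER_RESPONDENT

if __name__ == "__main__":
    print(f"recontact rate: {recontact_rate:.0%}")
    print(f"total rated profiles: {total_ratings}")
    for label, secs in [("wave 1 mean", 316), ("wave 1 median", 260),
                        ("wave 2 mean", 784), ("wave 2 median", 693)]:
        print(f"{label}: {seconds_to_minutes(secs)} min")
```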
One potential objection to convenience samples concerns whether they are unrepresentative of the target population (in this case, American adults over 18). This objection only holds for experiments if there is some reason to believe that causal effects are strongly heterogeneous across demographic groups. Our sample was not grossly unrepresentative on observable characteristics. Table 2 displays summary demographic information for the sample, excluding those who failed data integrity checks. In comparison, Gallup polling between 17 and 30 April showed a party ID of 29% Republican, 40% Independent, and 29% Democratic. 7 Census figures for 2019 show a median age of 38.4, 8 and the 2020 Census shows that 71% of Americans identified as White alone or in combination. 9 Although our sample is more Democratic-leaning, more educated, Whiter, and less female than the US population, it is not entirely unrepresentative, nor (as Table 2 suggests) did we suffer severe differential attrition by a major demographic group.
Summary statistics for panel by wave.
Another concern involves data quality. Then-recent events raised concerns about the reliability and location of MTurk workers (Dreyfuss, 2018). We addressed this potential threat in several ways. We used the CloudResearch tools to block suspicious IP addresses, verify workers’ country locations, and prevent duplicate IP addresses from taking the survey twice (Litman et al., 2021). We also employed simple data integrity checks. In wave 2, we asked respondents their gender, birth year, and educational attainment, which we had asked for in the first wave. Any respondent who submitted discrepant information was dropped, including 62 respondents whose educational attainment differed, 41 whose age differed, and one whose gender report was inconsistent. This was a conservative approach. The average age difference between waves was 0.07 years for all respondents and only 1.73 years for those reporting an age discrepancy, although six respondents reported differences of between 10 and 46 years. Similarly, most reported educational attainment differences were slight, as with the 10 respondents who reported having a bachelor's degree on the first wave and an associate's degree on the second, or the six respondents who oscillated between having a professional degree like a JD and a master's degree. Nevertheless, we decided to err on the side of caution.
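The cross-wave consistency rule described above amounts to a strict equality filter. The following sketch shows one way to express it, assuming each wave is stored as a dictionary keyed by a respondent identifier; the field names and data layout are hypothetical rather than the format of the actual survey data.

```python
CHECK_FIELDS = ("gender", "birth_year", "education")  # hypothetical field names


def passes_integrity_check(wave1_row, wave2_row, fields=CHECK_FIELDS):
    """True only if every checked field is identical across the two waves."""
    return all(wave1_row[f] == wave2_row[f] for f in fields)


def filter_panel(wave1_by_id, wave2_by_id):
    """Keep wave 2 respondents whose repeated answers match wave 1 exactly."""
    kept = {}
    for respondent_id, wave2_row in wave2_by_id.items():
        wave1_row = wave1_by_id.get(respondent_id)
        if wave1_row is not None and passes_integrity_check(wave1_row, wave2_row):
            kept[respondent_id] = wave2_row
    return kept


if __name__ == "__main__":
    # Invented toy records, for illustration only.
    w1 = {"a": {"gender": "F", "birth_year": 1980, "education": "BA"},
          "b": {"gender": "M", "birth_year": 1975, "education": "BA"}}
    w2 = {"a": {"gender": "F", "birth_year": 1980, "education": "BA"},
          "b": {"gender": "M", "birth_year": 1975, "education": "AA"}}
    print(sorted(filter_panel(w1, w2)))  # respondent "b" is dropped
```

Exact matching is deliberately conservative: even a one-step discrepancy in reported education removes the respondent, which mirrors the cautious approach the paragraph describes.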
One common way of administering conjoint experiments consists of presenting respondents with a “task” that involves rating (or choosing between) two profiles, usually presented in a spare tabular format. To help respondents place these profiles into a coherent gestalt, we first had respondents complete two training exercises presented in narrative form. These training exercises were static (although not described to respondents as such) and designed to provide maximum variation on different aspects. For example, the first scenario involved the following: A network intrusion into the computer systems of the US military has been detected. The incident caused billions of dollars in damage and led to hundreds of deaths. American intelligence agencies are somewhat certain the intrusion was carried out by a government agency based in North Korea.
The second one involved the following: A network intrusion into the computer systems of the US banking and financial system has been detected. The incident caused minimal economic damage and led to no deaths. American intelligence agencies are unanimously certain the intrusion was carried out by an individual based in the USA.
By collecting data on the dependent variables in the invariant training vignettes, we also collected information to allow us to control for different respondents’ baseline levels of severity in an approach inspired by the anchoring vignette method of King and Wand (2007), since all respondents took the same test. Finally, the training vignettes demonstrate some face validity of our approach. The mean severity rating of the first, significant attack was 82.2 (median of 89), while the mean severity rating of the second, not significant attack was 39.2 (median of 33). Similarly, 55.6% of respondents chose the two harshest reprisal options in the severe scenario, compared with only 16.38% in the not severe scenario.
Results
Severity of cyberattack
We begin by discussing the factors that influenced respondents’ evaluations of the severity of an attack. Figure 1 presents selected results. 10 We present both a baseline specification, involving only the experimental manipulations in the conjoint experiment itself, and a version including the participant variables we described earlier (demographics, party identification, training vignette evaluations of severity, and so on). Notably, the two specifications return almost identical results.

Results from ordinary least squares regressions of respondents’ ratings of attack severity. Baseline model includes only experimental manipulations, while the full model includes participant variables as described above. Full results available in the Online Appendix.
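The logic of the baseline specification can be illustrated with a simplified stand-in. For a single randomized attribute, the coefficients from an ordinary least squares regression on level dummies equal the differences between each level's mean rating and the mean for the excluded category. The sketch below computes those differences on invented toy data; the function name, field names, and numbers are assumptions for illustration and are not results from the study.

```python
from collections import defaultdict
from statistics import mean


def level_effects(rows, attribute, outcome, baseline):
    """Difference in mean outcome between each level and the baseline level."""
    by_level = defaultdict(list)
    for row in rows:
        by_level[row[attribute]].append(row[outcome])
    if baseline not in by_level:
        raise ValueError(f"no observations for baseline level: {baseline}")
    base_mean = mean(by_level[baseline])
    return {level: mean(values) - base_mean
            for level, values in by_level.items() if level != baseline}


# Invented ratings, for illustration only.
TOY_ROWS = [
    {"economic_damage": "minimal economic damage", "severity": 30.0},
    {"economic_damage": "minimal economic damage", "severity": 40.0},
    {"economic_damage": "a few million dollars", "severity": 44.0},
    {"economic_damage": "a few million dollars", "severity": 50.0},
    {"economic_damage": "billions of dollars", "severity": 70.0},
    {"economic_damage": "billions of dollars", "severity": 80.0},
]

if __name__ == "__main__":
    effects = level_effects(TOY_ROWS, "economic_damage", "severity",
                            baseline="minimal economic damage")
    for level, effect in effects.items():
        print(f"{level}: {effect:+.1f} points vs. baseline")
```

A full analysis would estimate all attributes jointly and account for each respondent rating several profiles, for example with respondent-clustered standard errors; this sketch omits both.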
Recall that H1 posited that “Respondents will evaluate attacks as more severe the greater the casualties inflicted in human and economic terms”. The results plainly demonstrate that we can reject the null of H1. There is a substantial difference between minimal economic damage and economic damage reaching into the billions of dollars; moving up each rung on the ladder of economic damage produces a major increase in perceived attack severity. Even an attack causing a few million dollars’ worth of damage would be perceived as 10 points more severe on a 100-point scale than one with minimal damage.
Similarly, attacks causing large numbers of deaths are perceived as far more severe than those causing no death or even a single fatality. 11 That is unsurprising. What is more surprising is that our results suggest that respondents value some levels of economic damage almost as highly as they do even dozens of deaths. To explore whether this reflected ceiling effects, we examined ratings for only those profiles in which the highest two categories of human casualties were presented. When deaths are at the highest level, ratings of severity do not substantially vary with increasing levels of economic damage, suggesting that there are some ceiling effects. Yet the ceiling is not overly low. Ratings of severity increase with increasing economic severity when deaths are “only” in the dozens, suggesting that respondents are indeed sensitive to economic casualties separate from human casualties. Our results thus suggest that respondents did indeed evaluate attacks causing medium to high levels of economic damage (tens of millions or billions of dollars) as equivalent to or even more severe than attacks causing a single fatality.
H2 posited that “Respondents will distinguish between attacks on different types of targets, ranking some as more severe than others”. We find support for this hypothesis, as attacks targeting hospitals, the power grid, and the US military were ranked as somewhat more severe (about 3 points on the 100-point scale) than attacks on the excluded category, large US corporations. These effects, however, are smaller than the purely effects-based categories of economic and human casualties.
The final hypothesis dealing with severity, H3, posited that “Respondents will make distinctions between attackers’ motivations, types, and country of origin in the evaluation of severity and retaliatory options”. Here the evidence is mixed. Aggressor motivations do not seem to affect evaluations of severity in a statistically or substantively significant way. Nor, in this specification, does the country of origin. There is a substantively very small (about 2 points) but statistically significant effect in the full specification between an attack carried out by a government agency and one by the excluded category, an individual, but otherwise aggressor type does not matter in these specifications.
We also present results from two other conjoint manipulations that were statistically significant: intelligence community confidence in attribution and independent endorsement of the intelligence community's confidence. We had no theoretical reason to believe these would be significant predictors ex ante.
In the Online Appendix, we present additional specifications that show subgroup variations. We find few substantively important distinctions when examining responses by gender, education, or party identification. There are a few more interesting variations when we examine responses by baseline attitudes toward relevant factors. Respondents who believe that the federal government, rather than the private sector, bears responsibility for cybersecurity are more consistent and pronounced in finding a distinction between targeted sectors. Those who believe the private sector should shoulder the responsibility for cybersecurity, in contrast, find attacks on hospitals, the power grid, and the US military less severe than attacks on private corporations and banks. Respondents who view cybersecurity as a more pressing threat are much more responsive to human and economic casualties in their evaluation of attack severity compared with those who view it as a less critical threat. Similarly, respondents who report being personally worried about cyberattacks rate attacks as more severe at increasing levels of human casualties than those who report not being very worried. We also split responses by attack country of origin and find few differences, although attacks from North Korea and Israel seem to be viewed slightly more severely; this suggests that our omitting highly implausible treatment profiles did not substantially bias our estimation. Finally, and perhaps most interesting, in models that include statistical controls for respondents’ baseline perceptions of the threat that foreign countries pose (measured in wave 1), the causal effect of the country of origin changed. Specifically, adjusting for pre-existing perceptions of threat made evaluations of attacks carried out by Russia and China slightly but significantly less severe than attacks carried out by Iran or North Korea, a roughly 2-point decrease on the 100-point rating scale.
Retaliation to cyberattacks
We now turn to an analysis of respondents’ support for retribution. We recode the dependent variable into an ordinal variable in which the score from 0 to 5 reflects the score of the harshest retaliation option respondents would support. 12 That is, a score of 0 means a respondent supported none of the specific reprisal options given, while one of 3 means supporting either applying sanctions or a large fine (but nothing more severe) and one of 5 means approving of airstrikes or the death penalty. This allows us to analyze respondents’ support using ordinal logistic regression.
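To show how an ordinal logistic model turns a linear predictor into probabilities for each of the six outcome categories, the sketch below implements the standard proportional-odds form, in which the probability of scoring at or below category k is a logistic function of a cutpoint minus the linear predictor. The text does not state the exact parameterization or software used, so this form, the cutpoints, and the predictor values are all assumptions for illustration.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(eta, cutpoints):
    """Probabilities of each ordinal category under a proportional-odds model.

    `eta` is the linear predictor for one profile and `cutpoints` is an
    increasing list of K - 1 thresholds for K categories, so that
    P(Y <= k) = logistic(cutpoints[k] - eta).
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


if __name__ == "__main__":
    hypothetical_cutpoints = [-2.0, -1.0, 0.0, 1.0, 2.0]  # five cutpoints, scores 0-5
    for eta in (0.0, 1.0):
        probs = category_probabilities(eta, hypothetical_cutpoints)
        print(eta, [round(p, 3) for p in probs])
```

A larger linear predictor shifts probability mass toward the harsher categories, which is the kind of pattern that plots of predicted probabilities are meant to display.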
Figure 2 displays selected results. (Full results are available in the Online Appendix.) We present results in both a pooled specification, mixing international and domestic experiments, and broken down by aggressor origin country (whether domestic or international). Notably, respondents display a much greater elasticity with respect to human casualties in the domestic scenarios compared with international ones, while in other factors they respond similarly across international and domestic environments. H3 proposes that respondents will distinguish between attackers’ motivations, types, and country of origin in evaluations of retaliatory options. These results demonstrate mixed support. Respondents are more willing to adopt harsher responses to attacks from Iran and North Korea compared with the excluded category of Israel. They are also more likely to support harsher reprisals against terrorists or a government agency. Attackers’ motivation, however, continues to not matter at conventional levels of statistical significance. On the other hand, increased certainty about the identity of the aggressor reaches conventional levels of statistical significance.

Results from ordinal logistic regressions of respondents’ ratings of response options. Model includes participant variables including demographics and partisan ID, technical features including the task number and the profile left- or right-hand side presentation, and scores indicating the highest reprisal option scored for each of the two training vignettes. Full results available in the Online Appendix.
In the Online Appendix, we present additional results. We find few differences in responses by gender. Republicans display more variation in distinguishing between country of origin and are more likely to respond to the highest levels of economic and human casualties than Democrats or Independents. College-educated respondents are less likely to distinguish between attacks on different target types than non-college educated respondents. Respondents who believe that cybersecurity is a federal responsibility make more distinctions between aggressor country of origin but otherwise answer similarly to those who believe cybersecurity is a private responsibility. Respondents who believe that cybersecurity is a critical threat favor more severe responses to attacks coming from Iran, North Korea, and the USA; attacks on the electrical power grid or the military; attacks that kill dozens or hundreds of people; and attacks that are carried out by terrorists or the government. Indeed, baseline attitudes on this point produce some of the most striking subgroup variations. Finally, once again, including measurements of respondents’ baseline attitudes toward the threat that different countries pose turns an attack by Russia and China into a strongly and significantly negative predictor of support for harsher retaliation compared with one from Iran or North Korea while leaving other measurements otherwise unaffected.
H4 proposes that respondents will favor more escalatory responses to more severe attacks. We operationalize this test in two ways. First, we examine the results of the model in Figure 2, which offers support for this hypothesis as there is clearly increasing support for harsher reprisal options as the human severity of an attack rises. Figure 3 presents an easier to interpret version of these results. The predicted probabilities clearly show that as the level of human casualties rises from zero to hundreds, the support for choosing more severe retaliation options increases. In ordinal terms, these are broadly equivalent between the domestic and international conditions, but in cardinal terms there is also greater support for relatively more punitive sanctions in the domestic than the international treatment condition—an apparent falsification of H5, which held that foreign attacks would be met with stiffer reprisals than domestic ones.

Predicted probabilities of highest-ranked choice based on models from Figure 2 by different level of human casualties.
We also test H4 by including respondents’ evaluations of the severity of the attack as a separate predictor. (Full results are available in the Online Appendix.) The severity rating is statistically significant in the pooled, international, and domestic analyses. Figure 4 displays predicted probabilities of choosing each option by respondents’ evaluation of attack severity. This figure illustrates the nonlinear relationship between evaluations of attack severity and support for harsher reprisal actions. The figure also demonstrates that the ordinal relationship between different levels of harsher reprisal support is equivalent across the domestic and international conditions, while the levels of support are, as before, somewhat different, with harsher reprisals more supported in the domestic than the international case (another rejection of H5).

Predicted probabilities of highest-ranked choice by evaluations of attack severity.
Discussion
Our findings contribute to active debates in the field. We find, for example, that two major factors—economic and human casualties—drive evaluations of attack severity. Other factors, particularly the target being attacked, also matter for these evaluations, but they are secondary. Many other factors that have been mooted as significant explanatory factors turn out to have much less importance.
We also find evidence that increasing severity (again most prominently involving deaths and economic damage) will prompt the public to support more severe forms of response. These findings give support to effects-based theories of retaliation and suggest that the “firebreaks” argument may be overstated. However, this support is only partial. Even if the effect sizes of human casualties and economic damage are larger than the other attributes we test, we find that the public considers many more factors when weighing its preferences for retaliation, including aggressor type, the origin of aggressors, and even the opinions of independent experts evaluating the credibility of official attribution claims. Finally, of course, we find that perceptions of severity prove to be substantial factors in affecting preferences over retaliation.
Most important, our findings show that under certain conditions Americans are more likely to support harsher retaliation policies. In general, the public wants a response, but its support for escalation is limited. We find increasing support for increasingly escalatory responses as the consequences of an attack become more severe. This may form the basis for a deterrence-by-punishment strategy more calibrated to a tit-for-tat response rather than a massive, overwhelming retaliation. To be sure, our findings also suggest that the public prefers retaliation-in-kind to escalation generally, which may hinder cross-domain deterrence strategies.
Conclusion
The public's evaluations of the severity of cyberattacks and how to respond to them follow different, subtler patterns than some have suggested. Although our work confirms that some factors that have been proposed as significant influences on public opinion (like fatalities) do matter, our findings also show that many other factors, such as motivation and the type of target of an attack, matter less, or in some cases not at all.
Our findings demonstrate that the effects of an attack matter for the public's evaluation of its severity and how to respond. This relationship, however, is not linear. Support for more severe retaliatory options rises in a curve as evaluations of attack severity increase. There is no bright line between a severe and a less-severe attack; rather, both evaluations of attack severity and preferences over retribution are usefully conceived of as continua. Earlier work focused on either the least or most extreme parts of this spectrum and so has missed the intermediate stages that we focus on. These empirical findings suggest that different preferences, and potentially mechanisms, may be operative at different levels of severity.
Our work holds implications for scholars and policymakers. Empirical researchers may find our results useful in refining subsequent studies of the relationship between public opinion and government actions. Theorists may find our results useful in refining their explanations and predictions, perhaps in part to simplify their theories by pruning factors that seem less significant than intuition would suggest. Policymakers should be aware that the public prefers cyber retaliation but supports escalation only conditionally. In general, the public prefers to respond to cyberattacks with cyberattacks, but there is some pressure for harsher responses as the severity of an attack increases, especially if the aggressor is a US citizen. At the international level, those pressures will be greater when dealing with minor powers compared with great-power rivals or friendly-to-neutral countries. All these points should be taken into consideration when developing US cybersecurity strategy.
Future research should continue to explore whether these results hold beyond the US case. Research suggests that decision-making processes and evaluations of attack severity do indeed differ cross-nationally (Gomez and Whyte, 2022). Future work could also investigate why respondents are less supportive of harsher responses against peer competitors, like Russia and China, than against Iran and North Korea. The mechanisms at play could include a sense of superiority vis-à-vis smaller, weaker countries (Musgrave, 2019), a fear of nuclear or conventional reprisal from peer competitors, paired with ignorance of the nuclear capabilities of North Korea, or a public preference for short, “winnable” wars. Regardless, understanding these results seems urgent. Finally, isolating the psychological mechanisms that produce support for escalatory measures could be a valuable step forward. Scholars have found that a higher perception of risk increases the likelihood of individuals supporting harsher measures against cyber rivals (Gross et al., 2017; Kostyuk and Wayne, 2020). Other works, however, contend that emotions affect support for retaliation against cyber terrorism (Shandler et al., 2021a). Understanding which mechanisms are operative and under what conditions could help reveal not only when but also why the public prefers escalation over proportional responses and whether the public sees different types of cyber operations differently.
Supplemental Material
Supplemental material, sj-pdf-1-cmp-10.1177_07388942221111069 for Hitting back or holding back in cyberspace: Experimental evidence regarding Americans’ responses to cyberattacks by Marcelo M Leal and Paul Musgrave in Conflict Management and Peace Science
Footnotes
Acknowledgements
For comments and suggestions, we thank the reviewers and the editor of Conflict Management and Peace Science, as well as Brandon Valeriano, Tatishe Nteta, Jesse Rhodes, Ray La Raja, Scott Blinder, Douglas Rice, and participants at the 2019 annual meeting of the Midwestern Political Science Association and the 2021 annual meeting of the American Political Science Association.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.