Abstract
Electronic voting promises the possibility of convenient and efficient systems for recording and tallying votes in an election. To be widely adopted, ensuring the security of the cryptographic protocols used in e-voting is of paramount importance. However, the security analysis of this type of protocol raises a number of challenges, and they are often out of reach of existing verification tools. In this paper, we study vote privacy, a central security property that should be satisfied by any e-voting system. More precisely, we propose the first formalisation of the recent
Introduction
Remote electronic voting systems aim at allowing the organisation of elections over the Internet, while providing the same guarantees as traditional paper voting. Although relying on e-voting for large-scale elections is controversial, it is already in use in many lower-stakes elections today (e.g. the Helios (1) voting system has been used to elect the IACR Board of Directors since 2010), and is likely to be used even more in the future, for better or for worse. These elections may involve a large number of voters and may have an important impact on democracy when it comes to electing political leaders. It is therefore of paramount importance to ensure the security of these systems.
As for security protocols, in general, formal methods provide powerful techniques to analyse e-voting systems and prove their security. Identifying what makes a good, secure e-voting system is a complex problem that has not yet been completely solved, and is actively being researched. It is, however, rather universally acknowledged that a central security guarantee e-voting systems should provide is vote privacy. Intuitively, this property states that votes must remain secret, so that no one can learn who voted for which candidate.
One common way of formalising vote privacy, which we will call
More recently, a new definition, called
Privacy-type properties, and in particular vote privacy, are often expressed using a notion of behavioural equivalence (13). A notable exception is the definition of
Contributions
Our goal here is to propose a methodology to enable the analysis of First, we propose a definition of Second, we identify some conditions under which Finally, we apply our result to several e-voting protocols from the literature relying on the tool
Related work
Developing reduction results to ease protocol security analysis is not a new approach. For instance, this approach has been used to bound the number of agents involved in an attack for both reachability (22), and equivalence properties (23). Reduction results bounding the number of sessions (24; 25) have also been proposed in more restricted settings. All these results do not apply in the context of e-voting protocols. Here, we would like to bind the number of voters (agents) participating in the election. However, since only one vote is counted for each voter, we cannot replace a session played by
The only existing result in that context is the result proposed in Arapinis et al. (6), where the authors give bounds on the number of voters and ballots – respectively,
This paper is an extended version of our work (26), published at the 27th ESORICS conference (2022): in particular, the
In this section, we introduce background notions on protocol modelling. We model security protocols in the symbolic model with a process algebra inspired by the applied pi calculus (27). Participants are represented by processes, while messages exchanged between participants are represented by terms. Our model is mostly standard, except that in order to model the stateful nature of e-voting protocols, we consider memory cells, that can store a persistent state across processes. We need to avoid concurrent accesses to memory cells while updating them: to that end, we use a specific instruction that atomically appends a message to the content of a memory cell.
Messages
We assume an infinite set
We consider the signature
Let
An element of
Continuing Example 1, we consider the equational theory
In order to provide a meaning to constructor symbols, we equip (constructor) terms with an equational theory. We assume a set
In addition, the semantics of destructor symbols is given by a set
We extend the notation
Consider
The destructor defined in Example 3 may seem of little use, since it does not let an attacker compute any value he did not already know. It does indeed not bring extra power to the attacker. However, when dealing with the case of a dishonest ballot box, having such a construction will make it easier to write recipes used in our reduction result.
In the following, we consider an arbitrary signature
We model protocols using a process calculus. We consider an infinite set of channel names
This syntax is rather standard, except for the memory cell operations. Intuitively,
Continuing our running example, we consider the process
The semantics of our calculus is defined as a transition relation

Semantics of our calculus.
For instance, considering an input on a public channel, that is, the rule
Continuing Example 4 with
Our definition of the
When establishing our reduction result, we will reason on the notion of static equivalence. In particular, we will assume an attack trace exists, and that this attack comes from publishing the result of the election, that is, that the two processes are in trace equivalence until the result is output. In such cases, we will deduce that the results output by the tally on either side are different (modulo
Let
First, assume that
Now, we assume that
Trace equivalence is the active counterpart of static equivalence. Two configurations are in trace equivalence if, however, the attacker behaves, the resulting sequences of messages observed by the attacker are in static equivalence.
We can consider a configuration
The resulting trace equivalence does not hold. This is simply due to the fact that
In the following, we will consider action-deterministic configurations. Intuitively, for an action-deterministic configuration
Consider two ground processes either there does not exist or such a
Indeed, once
In this section, we present our formal model of e-voting protocols and our
Modelling e-voting protocols
When modelling voting systems, we often need to encode some computations (e.g. performed by the ballot box) that cannot be represented by recipes (e.g. iterating through an arbitrary-sized list). We encode these computations as processes that do not share any names, channels, or memory cells with the rest of the process, except for a channel to return the result of the computation.
This process must be such that for all inputs
To use such a process to compute a term inside a process
We assume a set
Each voter has a private credential
To model the construction of ballots, we assume a recipe
When modelling vote privacy, the attacker chooses the vote
The formal definition of the voter’s process is given in Section 3.2 as it incorporates elements specific to the modelling of the property.
Continuing Example 2, for Helios, we use the following recipes:
We continue Example 7 and we consider for simplicity the case of a referendum with two possible votes
We model vote privacy by adapting the
The attacker has complete control over the ballots submitted by dishonest voters. Hence, we model them by a process that receives an arbitrary ballot from the attacker, and adds it to the ballot box
We then consider
We then define the
While we designed our symbolic definition to follow as closely as possible the original computational formulation of the property, there are two notable differences.
First, in the original notion, the oracle modelling honest voters was executed atomically: once the adversary submits his vote instructions, the generated ballot is immediately placed in the ballot box. In contrast, in our formalism, we allow executions where the process
Second, many voting schemes include mechanisms allowing everyone to check that the tallying authority computed the result correctly. Typically, the talliers publish, alongside the result itself, zero-knowledge proofs showing that they, for example, correctly decrypted the ballots in the ballot box. In
In (11), the authors propose two companion properties to
We assume that: if
Lastly, we assume that these functions characterise the behaviour of the
The
We first establish our reduction in the case where voters vote only once. Some systems allow voters to vote again by submitting a new ballot that will, for example, replace their previous one, in the interest of coercion-resistance. We extend our result to that setting in Section 5. Our
Main result
In order to reduce the number of dishonest voters needed to mount an attack against
This assumption needed to establish our reduction results captures the most common counting functions such as multiset, sum, and majority presented below.
The functions
Let
We can now state our main reduction theorem establishing that to study
Let
This theorem is an easy consequence of Propositions 2 and Proposition 4 stated and proved in Section 4.3 and inSection 4.4.
The ballot copy attack on Helios (with the 1-bounded multiset count) from Cortier and Smyth (5), mentioned in Example 6, can be performed against
In accordance with Theorem 1, one honest voter, one dishonest, and one accepted ballot are actually sufficient: the attacker can simply block the honest ballot, so that only the copy is counted leading to
In the proofs in the next two sections (i.e. Sections 4.3 and 4.4), we will start with an attack trace on the election process involving
The two ground processes
For these two processes, until phase
Now, when reaching phase
We can now show that when considering an attack trace
Let
Moreover, for any
Assume first that the minimal witness of this non-equivalence is actually a witness for the following non-inclusion: there does not exist such a trace exists, that is,
We first assume that such a witness of minimal length satisfies the requirements stated in item (1), that is, there does not exist
It remains to establish that
Finally, for any honest voter
When designing symbolic definitions that formalise security properties, even when an arbitrary number of participants are involved, a common modelling choice is to particularise the definition on a small number of honest agents, on which the property should hold. For instance, a key agreement property is often formalised by requiring that two fixed (but arbitrary) honest agents agree on the key at the end of their session, even in the presence of arbitrarily many dishonest agents. A more general definition would require that the same holds for any number of honest agents running the protocol in parallel, so that any two honest agents agree on a key once they finish a session together. The choice of fixing the honest agents when formalising the property produces a simpler property, with fewer honest sessions in parallel, which is usually easier on the automated tools. It is usually justified by arguing (more or less formally) that it implies the more general version: given an arbitrary number of honest agents, for any pair of agents, we can see from their point of view all other agents as potentially corrupted, and thus the simpler property applies and shows they agree.
A similar choice is implicitly made when considering the swapping definition for vote privacy. Indeed, the more general version would require that two scenarios, where the votes of any number of honest voters have been permuted, are always indistinguishable. This general formulation would in fact be closer to the one used in the computational setting. In contrast, the symbolic swapping definition considers two particular honest voters Alice and Bob, whose votes are exchanged. To justify this choice, it could be argued that, as any permutation can be decomposed in a finite sequence of swaps of two elements, by applying the seemingly weaker property as many times as needed, we can recover the general version. This argument is however not often formalised.
In order to remain faithful to the original computational
Consider a voting scheme
The general idea of this proof is to show we can isolate one specific honest voter whose ballot is the one causing
The difficulties are (i) how to find this particular voter and (ii) how to simulate honest voters with dishonest ones. The simulation would be easy for an honest voter
To solve both issues, the main idea is to go gradually from the

Proof of Proposition 2 – intermediate processes
Note that, in the case of the earlier reduction result from Arapinis et al.(6) for the
Before proving the reduction result, let us first observe that since the
Consider a key
We can now recall and give a detailed proof of Proposition 3.
Consider a voting scheme
We will show that under our assumptions we have
Fix some index
To prepare the terrain for applying this assumption later on, we define two additional processes
Thanks to the assumption that
By contradiction, let us now assume that
Moreover, for any
In addition, by action-determinacy,
Our next step is to construct a sequence of actions
Intuitively, the attacker interacting with
A subtle detail is that when constructing this ballot, the attacker will not be able to use the same private name
Due to the form of the processes, we can assume w.l.o.g. that
We now define recipes that we will use to let the attacker compute ballots for honest voters simulated by dishonest ones. For any
Let any input any input
By construction of If the validity check in a If the validity check in a
Executing
Note that, by construction, the recipes
The last step of our proof will be to describe the relation between
We construct a frame of recipes
For For all if if if For all If If If If Finally, the only remaining variable is the left ballot of voter the ballots given as input to dishonest voters the ballots given as input to dishonest voters Hence, the randomness-independence property (Lemma 4) applies and guarantees that tallying the ballots in
Using property (1), we can now conclude the proof. Indeed, we know that
On the other hand,
This second reduction result allows one to bind the number of dishonest voters when considering
Let
Roughly, if
In case (ii), we know that the public terms representing the final result are different on both sides. We apply our
We now give a detailed proof of Proposition 4.
First, relying on Lemma 3, we know that the processes under study are action-deterministic, and therefore, thanks to Proposition 1, we can assume that a witness of an attack of minimal length has some specific shape. Following the notation introduced in Section 3, we consider
We are going to show that this minimal witness
In the following, we will distinguish cases depending on the form of We first consider the case where some actions in phase 2 are performed by a dishonest voter We now assume that there is no input/output/append action performed by a dishonest voter during the casting phase (phase If during phase Otherwise, if phase We now consider the case of a trace
Hence the result.
We now consider the case where re-voting is allowed. We first adapt the
Modelling BPRIV with re-voting
The processes
We reuse the notations from Section 3.2, and we introduce in addition
Note that a replication operator has been added in front of the voter processes to model the fact that revote is now possible.
Reduction result with re-voting
We are now able to state our reduction result when considering re-voting.
Let
The proof of this theorem follows the same lines as the one when revote is not allowed and is composed of two main reduction steps. Before performing these two reduction steps, we may note that our election processes are still action-deterministic. Actually, the construction
If, during phase 2, some actions (e.g. Now, in case phase 1 contains an action of the form We now consider the case of a trace
Even after applying our reduction result, we may note that replication operators are still there, and thus establishing such an equivalence property (even when
Extension to the case of a dishonest ballot box
We now consider the case where the ballot box is no longer trusted. As in the previous section, we first adapt the
Symbolic BPRIV with a dishonest ballot box
The symbolic definition we propose in Section 3.2, based on the original game-based formulation of Bernhard et al. (11), considers a setting where the ballot box is trusted. Indeed, it does not give the attacker complete control over the contents of the ballot box: the attacker cannot arbitrarily write to the ballot box, but rather can only see and block honest ballots, and cast ballots in the name of dishonest voters only.
In (12), an extension of
We propose here an adaptation of our symbolic
Voter processes
We update the
Since we give the attacker complete control over the ballot box, there is no longer a need for a
Validity check
The ballot box to be tallied will be input directly from the attacker. Since the attacker has direct write access to the ballot box, we first check he has not added invalid ballots to it, and that only one ballot has been submitted per voter (as we exclude revote here).To do so, we will use the check that retrieve the associated check that
In addition, for each
Recovery
Before actually performing the tally, the recovery operation must be performed on the right-hand side. Depending on the way this recovery is done, the resulting definition expresses stronger or weaker guarantees. For this reason, the computational
In order to keep our symbolic definition similarly generic, we simply assume a computation
A typical recovery algorithm consists of going through the list submitted by the attacker, and, for each ballot, checking if it is equal to a ‘fake’ ballot generated by an honest voter. If so, the ballot is replaced with the corresponding ‘real’ ballot for that voter, otherwise, it is left unchanged. That algorithm, as discussed in Cortier et al. (12), leads to a property expressing that the attacker cannot modify honest ballots, but only choose to include them or to remove them – essentially, the ballots must be non-malleable.
We can encode it as the following recovery computation
Assumptions on the recovery computation
Verification tools tend to reason more easily on processes with a similar structure: for this reason, we include the recovery computation on both sides of the equivalence. On the left-hand side,
Moreover, we will need for our reduction result to assume that
for any permutation for any partition
Intuitively that property means that the recovery operation does not depend on the order in which ballots are cast, and can be computed piecewise on a partition of the ballots.
In addition, we assume that
Finally, we add another, more restrictive assumption on the recovery computation.
These assumptions, up to the last one, do not seem overly restrictive and hold for all recovery algorithms considered in (12). The final assumption, on the other hand, is much more restrictive: it typically prevents the recovery computation from adding back to the ballot box some ballots that may have been removed by the attacker. Cortier et al. (12) make use of such recovery when modelling variants of the property that express additional verifiability guarantees – basically, by having the recovery add all ballots from the voters who check their vote. We forbid such recovery computations here, meaning that our reduction result does not hold when considering verifiability guarantees. These typically require that a subset of honest voters perform verifications, and get additional assurance that their vote is counted. It is not particularly surprising that these guarantees cannot be captured by only one honest voter.
The computation
against a dishonest ballot box
Overall, the
We are now able to state our reduction result when considering a dishonest ballot box.
Let
The proof of this result follows the same steps as the one for the honest ballot box, with the same two main reductions. Note first that the election processes we defined for the dishonest ballot box case are still action-deterministic. In fact, Proposition 1 still holds for these new election processes. Rather than redoing the entire reduction proof, we detail here only the differences with the honest ballot box case.
From Proposition 1, We can define for each Just as in the previous proof, we let We first let Our goal is to choose The end of the proof is then just as in the honest ballot box case: we show that The proof is very similar to the one for Proposition 4, in the case of an honest ballot box. Consider a minimal-length attack trace We will now show that As in the case of the honest ballot box, we can distinguish three cases, depending on which point of the execution
By construction of the election process, both ballot boxes pass the validity checks of By the same reasoning as in the honest ballot box case, the static non-equivalence of The two lists Then, by assumption on These indices point to Consider the trace Consider also the
To illustrate the generality of our result, and to showcase how useful it can be in practice, we apply it to several well-known voting protocols from the literature considering different counting functions. In this section, we first present the counting functions, as well as the e-voting protocols that we consider for our analysis. Then, we discuss the results we obtained relying on the
Counting functions under study
We apply our results to several case studies considering different counting functions. We have already introduced some classical counting functions in Section 4.1, namely multiset, sum, and majority, and we have shown that they are
STV is a system where each voter casts a single ballot containing a total ordering of all candidates. A vote goes to the voter’s first choice. If that choice is later eliminated, instead of being thrown away, the vote is transferred to her second choice, and so on. In each round, the least popular candidate is eliminated. His votes are transferred based on voters’ subsequent choices. The process is repeated until one candidate remains, who is declared the winner. We assume a total order
Let
We assume that
There exists Otherwise, for Hence, Either Or Or
Therefore, there exists
We conclude that at most five votes are needed to ensure the result will be different.
In the following, we consider majority, multiset, sum, and STV (restricted to three candidates).
For our case study, we chose the following protocols: two variants of Helios (1), corresponding to its original version, subject to the attack discussed earlier, and a fixed version that includes identities in the ZKP; Belenios (28), and the related BeleniosRF (29) and BeleniosVS (20); Civitas (30); and Prêt-à-Voter (31; 32). Some of the protocols (notably Helios and Belenios) can make use of homomorphic encryption, so that all encrypted votes can be summed before decryption. In our case study, however, we only consider the mixnet version of these protocols, where ballots are instead mixed in a random order before decryption. Indeed, even if our reduction results apply in the presence of homomorphic encryption,
This attack can be mitigated in two ways. The first one consists of adding the voter’s identity to the ZKP, preventing the attacker from replaying it in his name. The second one is weeding, that is, adding a mechanism allowing the server to remove duplicate ballots. However, this operation is only effective if we assume that the ballots emitted by honest voters correctly reach the ballot box. Indeed, the attacker can otherwise simply block the original ballot, and still break the privacy property without the server being able to detect the copy. In our communication model, we decided not to make such a strong assumption, and the attacker is thus able to block messages. Therefore, in our framework, weeding is not a solution to prevent the replay attack mentioned above. We do not study this version of Helios, as the validity test with weeding does not fall in our framework described in Section 3, since we require that the validity test does not depend on the current content of the ballot box.
Note that the framework proposed in (6) makes the assumption that the ballots meant to be counted for each honest voter (typically the last one emitted), which are the ones swapped to express the privacy property, reach the ballot box. Against that weaker attacker, Helios with weeding is secure, as long as voters only vote once. It is not, though, if they revote: the replay could be performed using a previous ballot, which the attacker is allowed to block. Their security analysis correctly finds Helios with weeding secure when revote is disallowed. However, due to a misunderstanding of the reduction result, a replication operator is missing in the case of revote, which leads to the attack being missed in that case.
Results
We conducted the analysis for different counting functions, using our result to bind the number of agents and ballots. We considered majority, multiset, sum, and STV (restricted to three candidates). In fact, in the case of 1-bounded functions, since only one ballot needs to be accepted by the ballot box, the tallying is trivial and ends up being the same for different functions (majority, multiset, etc.). Thus, a single
We modelled the protocols briefly presented in Section 7.2 as processes satisfying our assumptions, and analysed them using
In some cases, we made slight adjustments to the protocols, so that they fit our framework. Detailed explanations for these modelling choices can be found in the files. All model files for our case study are available in (36). The results are presented in Table 1.
Overall, as can be seen in the table, our result allows for efficient verification of all protocols we considered. Thanks to the small bounds we establish, we get even better performance than previous work (6) in scenarios where that result applies – that is, the first column, for multiset counting. In that case, some analyses took several hours/days in (6), due to the higher bounds. Our result is more general and can handle, for example, STV counting. On most tested protocols, performance remains acceptable in that case. However
Conclusion
We have proposed a symbolic version of the
As mentioned earlier, a limitation of our definition is the modelling of the correct tallying proofs, which we abstracted away. In the computational definition, they are handled using simulators. It remains to be seen whether such techniques can be adapted to the symbolic setting, and how.
Vote privacy is considered a fundamental security property for electronic voting schemes. It is of course not the only desirable one: in particular, receipt-freeness and coercion-resistance can be seen as stronger variants of privacy that require that an attacker should be unable to ascertain the voters’ choice, even when they are willing, or coerced, to reveal their vote. Computational game-based definitions (30) as well as symbolic ones (3) have been proposed for these properties. They are, however, written in the same spirit as
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work received funding from the France 2030 program managed by the French National Research Agency under grant agreement No. ANR-22-PECY-0006.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
