Automated analysis of security protocols with global state

Abstract

Security APIs, key servers and protocols that need to keep the status of transactions must maintain a global, non-monotonic state, e.g., in the form of a database or register. However, most existing automated verification tools do not support the analysis of such stateful security protocols, sometimes for fundamental reasons, such as the encoding of the protocol as Horn clauses, which are inherently monotonic. A notable exception is the recent tamarin prover, which allows specifying protocols as multiset rewrite (msr) rules, a formalism expressive enough to encode state. As multiset rewriting is a “low-level” specification language with no direct support for concurrent message passing, encoding protocols correctly is a difficult and error-prone process.

We propose a process calculus which is a variant of the applied pi calculus with constructs for manipulation of a global state by processes running in parallel. We show that this language can be translated to msr rules whilst preserving all security properties expressible in a dedicated first-order logic for security properties. The translation has been implemented in a prototype tool which uses the tamarin prover as a backend. We apply the tool to several case studies, among them a simplified fragment of PKCS#11, the Yubikey security token, and an optimistic contract signing protocol.

Keywords

Automated verification, stateful security protocols, security APIs

1. Introduction

Automated analysis of security protocols has been extremely successful. Using automated tools, flaws have been discovered, for instance, in the Google Single Sign On Protocol [5] and in commercial security tokens implementing the PKCS#11 standard [10]; one may also recall Lowe’s attack [25] on the Needham-Schroeder public key protocol 17 years after its publication. While efficient tools such as ProVerif [7], AVISPA [4] or Maude-NPA [14] exist, these tools are generally not suitable for analyzing protocols that require non-monotonic global state, i.e., some database, register or memory location that can be read and altered by different parallel threads. The input language of the AVISPA tool offers support for this kind of state but only supports a bounded number of sessions. This is particularly restrictive when analysing security APIs, where attacks typically require several keys and API calls, which are difficult to bound a priori. ProVerif, one of the most efficient and widely used protocol analysis tools for an unbounded number of sessions, relies on an abstraction that encodes protocols in first-order Horn clauses. This abstraction is well suited for the monotonic knowledge of an attacker (who never forgets), makes the tool extremely efficient for verifying an unbounded number of protocol sessions and makes it possible to build on existing techniques for Horn clause resolution. However, Horn clauses are inherently monotonic: once a fact is true it cannot be set to false anymore. As a result, even though ProVerif’s input language, a variant of the applied pi calculus [2], allows a priori encodings of a global memory, the abstractions performed by ProVerif introduce false attacks. In the ProVerif user manual [8, Section 6.3.3] such an encoding of memory cells and its limitations are indeed explicitly discussed:

“Due to the abstractions performed by ProVerif, such a cell is treated in an approximate way: all values written in the cell are considered as a set, and when one reads the cell, ProVerif just guarantees that the obtained value is one of the written values (not necessarily the last one, and not necessarily one written before the read).”

Some work [3,12,26] has nevertheless used ingenious encodings of mutable state in Horn clauses, but these encodings have limitations that we discuss below.

Security APIs, such as the RSA PKCS#11 standard [27], IBM’s CCA [11] or the trusted platform module (TPM) [35], are a prominent example where non-monotonic global state appears. They have been known to be vulnerable to logical attacks for some time [9,24] and formal analysis has proved to be a valuable tool to identify attacks and find secure configurations. One promising paradigm for analyzing security APIs is to regard them as a participant in a protocol and use existing analysis tools. However, Herzog [18] already identified the failure to account for mutable global state as a major barrier to the application of security protocol analysis tools to verify security APIs. Apart from security APIs, many other protocols need to maintain databases: key servers need to store the status of keys, in optimistic contract signing protocols a trusted party maintains the status of a contract, RFID protocols maintain the status of tags, and more generally websites may need to store the current status of transactions.

Our contributions. We propose a tool for analyzing protocols that may involve non-monotonic global state, relying on Schmidt et al.’s tamarin tool [30,31] as a backend. We designed a new process calculus that extends the applied pi calculus by defining, in addition to the usual constructs for specifying concurrent processes, constructs for explicitly manipulating global state. This calculus serves as the tool’s input language. The heart of our tool is a translation from this extended applied pi calculus to a set of multiset rewrite rules that can then be analyzed by tamarin. We prove the correctness of this translation and show that it preserves all properties expressible in a dedicated first-order logic for expressing security properties. As a result, relying on the tamarin prover, we can analyze protocols without bounding the number of sessions and without making any abstractions. Moreover, the approach allows modelling a wide range of cryptographic primitives by means of equational theories. As the underlying verification problem is undecidable, tamarin may not terminate. However, it offers an interactive mode with a GUI which allows the user to guide the tool manually in its proof. Our specification language includes support for private channels, global state and locking mechanisms (which are crucial to write meaningful programs in which concurrent threads manipulate a common memory). The translation has been carefully engineered in order to favor termination by tamarin, including a goal ranking method tailored to the output of the translation. Several case studies illustrate the tool’s capability: a simple security API in the style of PKCS#11, a complex case study of the Yubikey security token, as well as several examples analyzed by other tools that aim at analyzing stateful protocols. In all of these case studies we were able to avoid restrictions that were necessary in previous works.

Related work. The most closely related work is the StatVerif tool by Arapinis et al. [3]. They propose an extension of the applied pi calculus, similar to ours, which is translated to Horn clauses and analyzed by the ProVerif tool. Their translation is sound but allows for false attacks, limiting the scope of protocols that can be analyzed. Moreover, StatVerif can only handle a finite number of memory cells: when analyzing an optimistic contract signing protocol this appeared to be a limitation, so only the status of a single contract was modeled, and a manual proof was provided to justify the correctness of this abstraction. In important case studies, e.g. key-management APIs like PKCS#11 or the Yubikey, an unbounded amount of memory is required to avoid artificially bounding the number of keys or Yubikey devices. Finally, StatVerif is limited to the verification of secrecy properties. As illustrated by the Yubikey case study, our work is more general and we are able to analyze complex injective correspondence properties.

Mödersheim [26] proposed a language with support for sets together with an abstraction where all objects that belong to the same sets are identified. His language, which is an extension of the low-level AVISPA intermediate format, is compiled into Horn clauses that are then analyzed, e.g., using ProVerif. His approach is tightly linked to this particular abstraction, limiting the scope of applicability, e.g., when keys may be compromised (all keys with the same attributes are abstracted to one and the same, thus either all are revealed, or none) or when the set of states of a key or value is not bounded a priori (as in the Yubikey case study). Mödersheim also discusses the need for a more high-level specification language, which we provide in this work.

There has also been work tailored to particular applications. In [13], Delaune et al. show by a dedicated hand proof that for analyzing PKCS#11 one may bound the message size. Their analysis still requires artificially bounding the number of keys. In a similar spirit, Delaune et al. [12] give a dedicated result for analyzing protocols based on the TPM and its registers. However, the number of reboots (which reinitialize registers) needs to be limited.

Guttman [17] also extended the strand space model by adding support for state. While the protocol execution is modeled using the classical strand spaces model, state is modeled by a multiset of facts, and manipulated by multiset rewrite rules. The extended model has been used for analyzing by hand an optimistic contract signing protocol. In a more recent paper Ramsdell et al. [28] propose another approach also based on the strand space model. Using the CPSA tool they obtain a symbolic representation (called skeletons) of all possible attacks. However, as their model analyzed by CPSA encodes the state in a message passing style, the tool may consider false attacks. They therefore import the CPSA result, as an axiom, in the theorem prover PVS and, based on a more precise model of the possible state transitions, refine their analysis to exclude the false attacks. The approach has been applied to the so-called envelope protocol, which was also analysed (in a slightly more restrictive model) in [12].

With the goal of relating different approaches for protocol analysis, Bistarelli et al. [6] also proposed a translation from a process algebra to multiset rewriting. However, they do not consider private channels, have no support for global state and assume that processes have a particular structure. These limitations significantly simplify the translation and its correctness proof. Moreover, their work does not include any tool support for automated verification.

Obviously any protocol that we are able to analyze can be directly analyzed by the tamarin prover [30,31], as the rules produced by our translation could have been given directly as an input to tamarin. Indeed, tamarin has already been used for analyzing a model of the Yubikey device [23], the case studies presented with Mödersheim’s abstraction, as well as those presented with StatVerif. It is furthermore able to reproduce the aforementioned results on PKCS#11 [13] and the TPM [12]; moreover, it does so without bounding the number of keys, security devices, reboots, etc. Contrary to ProVerif, tamarin sometimes requires additional typing lemmas which are used to guide the proof. These lemmas need to be written by hand (but are proved automatically). In our case studies we also needed to provide a few such lemmas manually. In our opinion, an important disadvantage of tamarin is that protocols are modeled as a set of multiset rewrite rules. This representation is very low level and far away from actual protocol implementations, making it very difficult to model a protocol adequately. Encoding private channels, nested replications and locking mechanisms directly as multiset rewrite rules is a tricky and error-prone task. As a result we observed that, in practice, the protocol models tend to be simplified. For instance, locking mechanisms are often omitted, modeling protocol steps as a single rule and making them effectively atomic. Such more abstract models may obscure issues in concurrent protocol steps and increase the risk of implicitly excluding attacks in the model that are entirely possible in a real implementation, e.g., race conditions. Using a more high-level specification language, such as our process calculus, arguably eases protocol specification and overcomes some of these risks. Examples in which the explicit modelling of locking mechanisms in SAPIC improved the protocol and/or the analysis include the Yubikey case study presented in Section 7. In our modelling of the Yubikey the server can handle several requests from different devices in parallel, which was not possible in the direct modelling in [23]. Another example is the model of the enhanced authorization mechanism introduced in the TPM 2.0 specification by Shao et al. [33]. In this work, a model of the TPM that executes API commands sequentially is compared to one that executes them in parallel, finding flaws in the parallel version. The TPM model in the tamarin example files models TPM commands as atomic steps. While an explicit modelling of locking steps is possible in tamarin, judging from existing models, it is not widely used, although protocols and analysis could benefit from it.

Since the first prototype of this translation was presented [21], subsequent work has demonstrated and extended its scope. The present calculus and verification method have been used to verify a configuration of the key-management API PKCS#11 [22] and were extended with loops to allow for the analysis of the streaming protocol TESLA [19]. In [33], Shao et al. used our tool to analyse the enhanced authorization mechanism introduced in the TPM 2.0 specification.

2. Preliminaries

Terms and equational theories. As usual in symbolic protocol analysis we model messages by abstract terms. Therefore we define an order-sorted term algebra with the sort $msg$ and two incomparable subsorts $pub$ and $fresh$ . For each of these subsorts we assume a countably infinite set of names, $FN$ for fresh names and $PN$ for public names. Fresh names will be used to model cryptographic keys and nonces while public names model publicly known values. We furthermore assume a countably infinite set of variables for each sort s, $V_{s}$ and let $V$ be the union of the set of variables for all sorts. We write $u : s$ when the name or variable u is of sort s. Let Σ be a signature, i.e., a set of function symbols, each with an arity. We write $f / n$ when function symbol f is of arity n. We denote by $T_{Σ}$ the set of well-sorted terms built over Σ, $PN$ , $FN$ and $V$ . For a term t we denote by $names (t)$ , respectively $vars (t)$ the set of names, respectively variables, appearing in t. The set of ground terms, i.e., terms without variables, is denoted by $M_{Σ}$ . When Σ is fixed or clear from the context we often omit it and simply write $T$ for $T_{Σ}$ and $M$ for $M_{Σ}$ .

We equip the term algebra with an equational theory E, that is a finite set of equations of the form $M = N$ where $M, N \in T$ . From the equational theory we define the binary relation $=_{E}$ on terms, which is the smallest equivalence relation containing equations in E that is closed under application of function symbols, bijective renaming of names and substitution of variables by terms of the same sort. Furthermore, we require E to distinguish different fresh names, i.e., $\forall a, b \in FN : a \neq b \Rightarrow a \neq_{E} b$ .

Example.
Symmetric encryption can be modelled using a signature $Σ = {senc / 2, sdec / 2}$ and an equational theory defined by $sdec (senc (m, k), k) = m .$
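To make the equation concrete, the following Python sketch represents terms as nested tuples and normalises them by reading $sdec (senc (m, k), k) = m$ from left to right. The tuple encoding and the names `normalize` and `equal_mod_E` are illustrative assumptions and not part of the formalism; orienting the equation as a rewrite rule is only claimed to be adequate for this particular theory.

```python
# A term is either a string (a name or a variable) or a tuple
# (function_symbol, arg_1, ..., arg_n). This encoding is an assumption.

def normalize(term):
    """Rewrite sdec(senc(m, k), k) -> m bottom-up until no redex remains."""
    if isinstance(term, str):
        return term
    head, *args = term
    args = [normalize(a) for a in args]
    if head == "sdec" and len(args) == 2:
        cipher, key = args
        if (isinstance(cipher, tuple) and len(cipher) == 3
                and cipher[0] == "senc" and cipher[2] == key):
            # cipher[1] is already in normal form because args were normalized.
            return cipher[1]
    return (head, *args)


def equal_mod_E(s, t):
    """Decide s =_E t for this one-equation theory by comparing normal forms."""
    return normalize(s) == normalize(t)
```

For instance, `equal_mod_E(("sdec", ("senc", "m", "k"), "k"), "m")` holds, whereas decrypting with a different key leaves the term unchanged.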

For the remainder of the article we assume that E refers to some fixed equational theory and that the signature and equational theory always contain symbols and equations for pairing and projection, i.e., ${⟨ \cdot, \cdot ⟩, fst, snd} \subseteq Σ$ and equations $fst (⟨ x, y ⟩) = x$ and $snd (⟨ x, y ⟩) = y$ are in E. We will sometimes use $⟨ x_{1}, x_{2}, \dots, x_{n} ⟩$ as a shortcut for $⟨ x_{1}, ⟨ x_{2}, ⟨ \dots, ⟨ x_{n - 1}, x_{n} ⟩ \dots ⟩$ .

We suppose the usual notion of positions for terms. A position p is a sequence of positive integers and $t |_{p}$ denotes the subterm of t at position p.

Facts. We also assume an unsorted signature $Σ_{fact}$ , disjoint from Σ. The set of facts is defined as $F := \{F (t_{1}, \dots, t_{k}) \mid t_{i} \in T_{Σ}, F \in Σ_{fact} \text{ of arity } k\}$ . Facts will be used both to annotate protocols, by means of events, and for defining multiset rewrite rules. We partition the signature $Σ_{fact}$ into linear and persistent fact symbols. We suppose that $Σ_{fact}$ always contains a persistent, unary symbol $K$ and a linear, unary symbol $Fr$ . Given a sequence or set of facts S we denote by $lfacts (S)$ the multiset of all linear facts in S and $pfacts (S)$ the set of all persistent facts in S. By notational convention facts whose identifier starts with ‘!’ will be persistent. $G$ denotes the set of ground facts, i.e., the set of facts that do not contain variables. For a fact f we denote by $ginsts (f)$ the set of ground instances of f. This notation is also lifted to sequences and sets of facts as expected.

Predicates. We assume an unsorted signature $Σ_{pred}$ of predicate symbols that is disjoint from Σ and $Σ_{fact}$ . The set of predicate formulas is defined as $P := \{pr (t_{1}, \dots, t_{k}) \mid t_{i} \in T_{Σ}, pr \in Σ_{pred} \text{ of arity } k\}$ . Predicate formulas will be used to describe branching conditions in protocols. The semantics of a predicate is defined via a first-order formula over atoms of the form $t_{1} \approx t_{2}$ , i.e. the grammar for such formulae is $⟨ ϕ ⟩ ::= t_{1} \approx t_{2} \mid \neg ϕ \mid ϕ_{1} \land ϕ_{2} \mid \exists x . ϕ,$ where $t_{1}$ , $t_{2}$ are terms and $x \in V$ . For an n-ary predicate symbol $pr$ , $pr (x_{1}, \dots, x_{n})$ is defined by a formula $ϕ_{pr}$ such that $fv (ϕ_{pr}) \subseteq \{x_{1}, \dots, x_{n}\}$ , where $fv$ denotes the free variables in a formula, i.e., variables $v \in V$ not bound by $\exists v$ . The semantics of the first-order formulae is as usual where we interpret ≈ as $=_{E}$ .

Example.
Suppose $encSucc \in Σ_{pred}$ is a binary predicate symbol. We can define it as follows, so that it checks whether a term $x_{1}$ was encrypted using a key $x_{2}$ : $ϕ_{encSucc} = \exists m . enc (m, x_{2}) \approx x_{1} .$

Substitutions. A substitution σ is a partial function from variables to terms. We suppose that substitutions are well-typed, i.e., they only map variables of sort s to terms of sort s, or of a subsort of s. We denote by $σ = {^{t_{1}} /_{x_{1}}, \dots,^{t_{n}} /_{x_{n}}}$ the substitution whose domain is $D (σ) = {x_{1}, \dots, x_{n}}$ and which maps $x_{i}$ to $t_{i}$ . As usual we homomorphically extend σ to apply to terms and facts and use a postfix notation to denote its application, e.g., we write $t σ$ for the application of σ to the term t. A substitution σ is grounding for a term t if $t σ$ is ground. Given a function g we let $g (x) = ⊥$ when $x \notin D (g)$ . When $g (x) = ⊥$ we say that g is undefined for x. We define the function $f : = g [a \mapsto b]$ with $D (f) = D (g) \cup {a}$ as $f (a) : = b$ and $f (x) : = g (x)$ for $x \neq a$ .
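As a small illustration of these conventions, the sketch below applies a substitution homomorphically to a term and implements the update $g [a \mapsto b]$ on partial functions. It assumes terms are nested tuples with strings as names and variables, ignores sorts, and uses Python dictionaries for partial functions; the names `apply_subst` and `update` are hypothetical.

```python
# Terms: a string is a name or variable, a tuple is (symbol, arg_1, ..., arg_n).
# Partial functions (substitutions, stores) are modelled as dicts (assumption).

def apply_subst(term, sigma):
    """Homomorphic extension of sigma: replace variables in D(sigma), keep the rest."""
    if isinstance(term, str):
        return sigma.get(term, term)
    head, *args = term
    return (head, *(apply_subst(a, sigma) for a in args))


def update(g, a, b):
    """Return f = g[a -> b] without modifying g."""
    f = dict(g)
    f[a] = b
    return f
```

With `sigma = {"x": ("pair", "a", "b")}`, applying `apply_subst` to `("senc", "x", "k")` yields `("senc", ("pair", "a", "b"), "k")`; a lookup such as `g.get("z")` returning `None` plays the role of $g (z) = ⊥$ .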

Sets, sequences and multisets. We write $N_{n}$ for the set ${1, \dots, n}$ . Given a set S we denote by $S^{*}$ the set of finite sequences of elements from S and by $S^{#}$ the set of finite multisets of elements from S. We use the superscript $^{#}$ to annotate the usual multiset operations, e.g. $S_{1} \cup^{#} S_{2}$ denotes the multiset union of multisets $S_{1}$ , $S_{2}$ . Given a multiset S we denote by $set (S)$ the set of elements in S. The sequence consisting of elements $e_{1}, \dots, e_{n}$ will be denoted by $[e_{1}, \dots, e_{n}]$ and the empty sequence is denoted by $[]$ . For a sequence S we denote by $| S |$ its length, i.e., its number of elements. We use · for the operation of adding an element either to the start or to the end, e.g., $e_{1} \cdot [e_{2}, e_{3}] = [e_{1}, e_{2}, e_{3}] = [e_{1}, e_{2}] \cdot e_{3}$ . Given a sequence S, we denote by $idx (S)$ the set of positions in S, i.e., $N_{n}$ when S has n elements, and for $i \in idx (S)$ $S_{i}$ denotes the ith element of the sequence. Set membership modulo E is denoted by $\in_{E}$ and defined as $e \in_{E} S$ iff $\exists e^{'} \in S . e^{'} =_{E} e$ . $\subset_{E}$ and $=_{E}$ are defined for sets in a similar way. Application of substitutions is lifted to sets, sequences and multisets as expected. By abuse of notation we sometimes interpret sequences as sets or multisets; the applied operators should make the implicit cast clear.

3. A cryptographic pi calculus with explicit state

3.1. Syntax and informal semantics

Our calculus, dubbed SAPiC (Stateful Applied Pi calculus), is a variant of the applied pi calculus [2]. In addition to the usual operators for concurrency, replication, communication and name creation, it offers several constructs for reading and updating an explicit global state. The grammar for processes is described in Fig. 1.

Fig. 1. Syntax, where $M, N \in T$ and $Pred \in P$ .

0 denotes the terminal process. $P ∣ Q$ is the parallel execution of processes P and Q and $! P$ the replication of P, allowing an unbounded number of sessions in protocol executions. The construct $ν n; P$ binds the name $n \in FN$ in P and models the generation of a fresh, random value. The processes out( $M, N$ ); P and in( $M, N$ ); P represent the output, respectively input, of message N on channel M. Readers familiar with the applied pi calculus [2] may note that we opted for the possibility of pattern matching in the input construct, rather than merely binding the input to a variable x. The process if $Pred$ then P else Q will execute P or Q, depending on whether $Pred$ holds. For example, if $Pred = equal (M, N)$ , and $ϕ_{equal} = x_{1} \approx x_{2}$ , then if $equal (M, N)$ then P else Q will execute P if $M =_{E} N$ and Q otherwise. (In the following, we will use $M = N$ as short-hand for $equal (M, N)$ .) The event construct is merely used for annotating processes and will be useful for stating security properties. For readability we sometimes omit else Q when Q is 0, as well as trailing 0 processes.

The remaining constructs are used for manipulating state and are new compared to the applied pi calculus. The construct insert M,N binds the value N to a key M. Successive inserts allow changing this binding. We emphasise that only one value is bound to a key at any time, and that successive inserts update the binding. The delete M operation simply “undefines” the mapping for the key M. The construct lookup M as x in P else Q allows for retrieving the value associated to M, binding it to the variable x in P. If the mapping is undefined for M the process behaves as Q. The lock and unlock constructs are used to gain or waive exclusive access to a resource M, in the style of Dijkstra’s binary semaphores: if a term M has been locked, any subsequent attempt to lock M will be blocked until M has been unlocked. This is essential for writing protocols where parallel processes may read and update a common memory.
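The informal meaning of these constructs can be mimicked by a small Python class. This is only a sketch under simplifying assumptions: keys are compared syntactically rather than modulo E, a blocked lock is represented by `try_lock` returning `False` instead of suspending a process, and the class and method names are hypothetical.

```python
class GlobalState:
    """Toy model of the store (a partial function) and the set of held locks."""

    def __init__(self):
        self.store = {}     # key term -> value term; at most one value per key
        self.locks = set()  # terms that are currently locked

    def insert(self, key, value):
        # A later insert for the same key overwrites the earlier binding.
        self.store[key] = value

    def delete(self, key):
        # "Undefine" the mapping; deleting an unbound key has no effect.
        self.store.pop(key, None)

    def lookup(self, key, then_branch, else_branch):
        # lookup key as x in P else Q: run P with the stored value, or Q if unbound.
        if key in self.store:
            return then_branch(self.store[key])
        return else_branch()

    def try_lock(self, key):
        # Returns False when the lock is held; a real process would block here.
        if key in self.locks:
            return False
        self.locks.add(key)
        return True

    def unlock(self, key):
        self.locks.discard(key)
```

A process that reads and then updates a key would call `try_lock` first, perform `lookup` and `insert`, and finally `unlock`, so that no other process can interleave between the read and the write.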

In the following example, which will serve as our running example, we model a security API that, even though much simplified, illustrates the most salient issues that occur in the analysis of security APIs such as PKCS#11 [10,13,15].

Example.

We consider a security device that allows the creation of keys in its secure memory. The user can access the device via an API. If he creates a key, he obtains a handle, which he can use to let the device perform operations on his behalf. For each handle the device also stores an attribute which defines what operations are permitted for this handle. The goal is that the user can never gain knowledge of the key, as the user’s machine might be compromised. We model the device by the following process (we use $out (m)$ as a shortcut for $out (c, m)$ for a public channel c): $\begin{matrix} ! P_{new} ∣! P_{set} ∣! P_{dec} ∣! P_{wrap}, where \end{matrix}$

In the first line, the device creates a new handle h and a key k and, by the means of the event NewKey $(h, k)$ , logs the creation of this key. It then stores the key that belongs to the handle by associating the pair $⟨ ‘key’, h ⟩$ to the value of the key k. In the next line, $⟨ ‘att’, h ⟩$ is associated to a public constant $‘dec’$ . Intuitively, we use the public constants $‘key’$ and $‘att’$ to distinguish two databases. The process

allows the attacker to change the attribute of a key from the initial value $‘dec’$ to another value $‘wrap’$ . If a handle has the $‘dec’$ attribute set, it can be used for decryption:

The first lookup stores the value associated to $⟨ ‘att’, h ⟩$ in a. The value is compared against $‘dec’$ . If the comparison and another lookup for the associated key value k succeeds, we check whether decryption succeeds and, if so, output the plaintext.
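The behaviour described so far for key creation, attribute change and decryption can be approximated by the following Python sketch. It is not the process definition itself: the class `Device`, its method names, the tuple encoding of ciphertexts and the unconditional attribute update in `set_wrap` are assumptions made for illustration, and fresh names are simulated with a counter.

```python
import itertools


class Device:
    """Toy simulation of the device; terms are strings or nested tuples."""

    def __init__(self):
        self.store = {}                   # the global store
        self.events = []                  # log of events, e.g. NewKey(h, k)
        self._counter = itertools.count()

    def new(self):
        # Create a handle and a key, log the event, fill both "databases".
        i = next(self._counter)
        h, k = f"h{i}", f"k{i}"
        self.events.append(("NewKey", h, k))
        self.store[("key", h)] = k
        self.store[("att", h)] = "dec"
        return h                          # only the handle leaves the device

    def set_wrap(self, h):
        # Assumption: the attribute is overwritten without further checks.
        self.store[("att", h)] = "wrap"

    def dec(self, h, c):
        # Decrypt c under the key of handle h only if the attribute is 'dec'.
        if self.store.get(("att", h)) != "dec":
            return None
        k = self.store.get(("key", h))
        if k is None:
            return None
        if isinstance(c, tuple) and len(c) == 3 and c[0] == "senc" and c[2] == k:
            return c[1]                   # decryption succeeds: output plaintext
        return None
```

After `set_wrap(h)` the same handle can no longer be used for decryption, which reflects the role of the attribute as a permission.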

If a key has the $‘wrap’$ attribute set, it might be used to encrypt the value of a second key, e.g., to export the key for external storage:

The bound names of a process are those that are bound by $ν n$ . We suppose that all names of sort $fresh$ appearing in the process are under the scope of such a binder. Free names must be of sort $pub$ . A variable x can be bound in two ways: (i) by the construct lookup M as x, or (ii) $x \in vars (N)$ in the construct in( $M, N$ ) and x is not under the scope of a previous binder. While the construct lookup M as x always acts as a binder, the input construct does not rebind an already bound variable but performs pattern matching. For instance in the process $\begin{matrix} P = in (c, f (x)); in (c, g (x)) \end{matrix}$ x is bound by the first input and pattern matched in the second. It might seem odd that lookup acts as a binder, while input does not. We justify this decision as follows: as $P_{dec}$ and $P_{wrap}$ in the previous example show, lookups often appear after an input has been received. If lookup were to use pattern matching, the following process $\begin{matrix} P = in (c, x); lookup ‘store’ as x in P^{'} \end{matrix}$ might unexpectedly check whether ‘store’ contains the message given by the adversary, instead of binding the content of ‘store’ to x, due to an undetected clash in the naming of variables.

A process is ground if it does not contain any free variables. We denote by $P σ$ the application of the homomorphic extension of the substitution σ to P. As usual we suppose that the substitution only applies to free variables. We sometimes interpret the syntax tree of a process as a term and write $P |_{p}$ to refer to the subprocess of P at position p (where ∣, if and lookup are interpreted as binary symbols, all other constructs as unary). Our tool supports additional syntactic sugar: else-branches consisting of the 0 process can be omitted, and let-constructs for terms (let $m = dec (c, k)$ in out(m)) and processes (let $P = \dots$ in !P) perform simple substitution.

3.2. Semantics

Frames and deduction. Before giving the formal semantics of SAPiC we introduce the notions of frame and deduction. A frame consists of a set of fresh names $\tilde{n}$ and a substitution σ and is written $ν \tilde{n} . σ$ . Intuitively a frame represents the sequence of messages that have been observed by an adversary during a protocol execution and secrets $\tilde{n}$ generated by the protocol, a priori unknown to the adversary. Deduction models the capacity of the adversary to compute new messages from the observed ones.

Definition 1 (Deduction).

We define the deduction relation $ν \tilde{n} . σ ⊢ t$ as the smallest relation between frames and terms defined by the deduction rules in Fig. 2.

Fig. 2. Deduction rules.

Fig. 3. Proof tree witnessing that $ν \tilde{n} . σ ⊢ k_{2}$ , where $c = senc (k_{2}, k_{1})$ .

Example.

If one key is used to wrap a second key, then, if the intruder learns the first key, he can deduce the second. For $\tilde{n} = k_{1}, k_{2}$ and $σ = {^{senc (k_{2}, k_{1})} /_{x_{1}},^{k_{1}} /_{x_{2}}}$ , $ν \tilde{n} . σ ⊢ k_{2}$ , as witnessed by the proof tree given in Fig. 3.
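The deduction in this example can be replayed with a short Python sketch. It does not implement the general deduction rules; it is a standard saturation procedure specialised, by assumption, to symmetric encryption and pairing: messages in the frame are decomposed until a fixpoint is reached, and the target is then checked for constructibility. The function names and the tuple encoding of terms are hypothetical.

```python
def can_build(term, known, public):
    """Can the adversary construct term from known messages and public names?"""
    if term in known or term in public:
        return True
    if isinstance(term, tuple) and term[0] in ("senc", "pair"):
        return all(can_build(arg, known, public) for arg in term[1:])
    return False


def deducible(target, frame, public=frozenset()):
    """Saturate the frame by projection and decryption, then test the target."""
    known = set(frame)
    changed = True
    while changed:
        changed = False
        for term in list(known):
            if not isinstance(term, tuple):
                continue
            parts = []
            if term[0] == "pair":
                parts = [term[1], term[2]]           # fst and snd
            elif term[0] == "senc" and can_build(term[2], known, public):
                parts = [term[1]]                    # sdec with a deducible key
            for part in parts:
                if part not in known:
                    known.add(part)
                    changed = True
    return can_build(target, known, public)
```

With the frame `[("senc", "k2", "k1"), "k1"]`, the call `deducible("k2", ...)` succeeds, while it fails if `"k1"` is removed from the frame.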

Operational semantics. We can now define the operational semantics of our calculus. The semantics is defined by a labelled transition relation between process configurations. A process configuration is a 5-tuple $(E, S, P, σ, L)$ where

$E \subseteq FN$ is the set of fresh names generated by the processes;

$S : M_{Σ} \to M_{Σ}$ is a partial function modeling the store;

$P$ is a multiset of ground processes representing the processes executed in parallel;

σ is a ground substitution modeling the messages output to the environment;

$L \subseteq M_{Σ}$ is the set of currently acquired locks.

The transition relation is defined by the rules described in Fig. 4. Transitions are labelled by sets of ground facts. For readability we omit empty sets and brackets around singletons, i.e., we write → for $\overset{\emptyset}{⟶}$ and $\overset{f}{⟶}$ for $\overset{{f}}{⟶}$ . We write $\to^{*}$ for the reflexive, transitive closure of → (the transitions that are labelled by the empty sets) and write $\overset{f}{\Rightarrow}$ for $\to^{*} \overset{f}{\to} \to^{*}$ . We can now define the set of traces, i.e., possible executions that a process admits.

Definition 2 (Traces of P).

Given a ground process P we define the set of traces of P as $\begin{matrix} {traces}^{pi} (P) = \{ [F_{1}, \dots, F_{n}] \mid (\emptyset, \emptyset, \{P\}, \emptyset, \emptyset) \overset{F_{1}}{⟹} (E_{1}, S_{1}, P_{1}, σ_{1}, L_{1}) \overset{F_{2}}{⟹} \dots \overset{F_{n}}{⟹} (E_{n}, S_{n}, P_{n}, σ_{n}, L_{n}) \} . \end{matrix}$

Fig. 4. Operational semantics.

Example.

In Fig. 5 we display the transitions corresponding to the creation of a key on the security device in our running example and witness that $[NewKey (h^{'}, k^{'})] \in {traces}^{pi} (P)$ .

Fig. 5. Example of transitions modelling the creation of a key on a PKCS#11-like device.

4. Labelled multiset rewriting

We now recall the syntax and semantics of labelled multiset rewriting rules, which constitute the input language of the tamarin tool [30].

Definition 3 (Multiset rewrite rule).

A labelled multiset rewrite rule $ri$ is a triple $(l, a, r)$ , $l, a, r \in F^{*}$ , written $l - [a] \to r$ . We call $l = prems (ri)$ the premises, $a = actions (ri)$ the actions, and $r = conclusions (ri)$ the conclusions of the rule.

Definition 4 (Labelled multiset rewriting system).

A labelled multiset rewriting system is a set of labelled multiset rewrite rules R, such that each rule $l - [a] \to r \in R$ satisfies the following conditions:

$l, a, r$ do not contain fresh names and

r does not contain $Fr$ -facts.

A labelled multiset rewriting system is called well-formed, if additionally

for each $l^{'} - [a^{'}] \to r^{'} \in_{E} ginsts (l - [a] \to r)$ we have that $⋂_{r^{″} =_{E} r^{'}} names (r^{″}) \cap FN \subseteq ⋂_{l^{″} =_{E} l^{'}} names (l^{″}) \cap FN$ .

We define one distinguished rule Fresh which is the only rule allowed to have $Fr$ -facts on the right-hand side: $\begin{matrix} Fresh : [] - [] \to [Fr (x : fresh)] . \end{matrix}$

The semantics of the rules is defined by a labelled transition relation.

Definition 5 (Labelled transition relation).

Given a multiset rewriting system R we define the labelled transition relation $\to_{R} \subseteq G^{\#} \times P (G) \times G^{\#}$ as $\begin{matrix} S {\overset{a}{⟶}}_{R} ((S ∖^{\#} lfacts (l)) \cup^{\#} r) \end{matrix}$ if and only if $l - [a] \to r \in_{E} ginsts (R \cup Fresh)$ , $lfacts (l) \subseteq^{\#} S$ and $pfacts (l) \subseteq S$ .
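A single transition of this relation can be sketched in Python using `collections.Counter` as the multiset of ground facts. The sketch assumes that the rule is already a ground instance, that facts are tuples whose first component is the fact symbol, and that equality is syntactic rather than modulo E; the names `step` and `is_persistent` are hypothetical.

```python
from collections import Counter


def is_persistent(fact):
    # By the notational convention, persistent fact symbols start with '!'.
    return fact[0].startswith("!")


def step(state, rule):
    """Apply one ground rule instance (premises, actions, conclusions) to state.

    Returns (new_state, label), or None if the rule is not applicable.
    """
    premises, actions, conclusions = rule
    linear = Counter(f for f in premises if not is_persistent(f))
    persistent = {f for f in premises if is_persistent(f)}
    # lfacts(l) must be included in S as a multiset ...
    if any(state[f] < n for f, n in linear.items()):
        return None
    # ... and pfacts(l) must be included in S as a set.
    if any(state[f] == 0 for f in persistent):
        return None
    # Linear premises are consumed, persistent ones stay, conclusions are added.
    new_state = state - linear + Counter(conclusions)
    return new_state, frozenset(actions)
```

Applying a rule with premises `Fr(n)` and `!Key(k)` consumes `Fr(n)` but keeps `!Key(k)`, so the same ground instance cannot fire a second time.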

Definition 6 (Executions).

Given a multiset rewriting system R we define its set of executions as $\begin{array}{l} {exec}^{msr} (R) = \{ \emptyset {\overset{A_{1}}{⟶}}_{R} \dots {\overset{A_{n}}{⟶}}_{R} S_{n} \mid \forall a, i, j : 0 ⩽ i \neq j < n . \\ \quad (S_{i + 1} ∖^{\#} S_{i}) = \{Fr (a)\} \Rightarrow (S_{j + 1} ∖^{\#} S_{j}) \neq \{Fr (a)\} \} . \end{array}$

The set of executions consists of transition sequences that respect freshness, i.e., for a given name a the fact $Fr (a)$ is added only once, or in other words the rule Fresh is fired at most once for each name. We define the set of traces in a similar way as for processes.

Definition 7 (Traces).

The set of traces is defined as $\begin{matrix} {traces}^{msr} (R) = \{[A_{1}, \dots, A_{n}] ∣ \forall 0 ⩽ i ⩽ n . A_{i} \neq \emptyset \text{ and } \emptyset {\overset{A_{1}}{⟹}}_{R} \dots {\overset{A_{n}}{⟹}}_{R} S_{n} \in {exec}^{msr} (R)\}, \end{matrix}$ where ${\overset{A}{⟹}}_{R}$ is defined as $\overset{\emptyset}{⟶}_{R}^{*} \overset{A}{⟶}_{R} \overset{\emptyset}{⟶}_{R}^{*}$.

Note that, both for processes and for multiset rewrite rules, a trace is a sequence of sets of facts.
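The freshness condition of Definition 6 and the removal of silent steps in Definition 7 can be sketched as follows. This is an illustration under assumptions: states are `Counter` multisets of string facts, fresh facts are written `Fr(...)`, and the function names `respects_freshness` and `trace_of` are hypothetical.

```python
from collections import Counter

def respects_freshness(states):
    """Check that no Fr(a) fact is added by two different steps (Definition 6).

    `states` is the list [S_0, ..., S_n] of multisets visited by an execution.
    """
    seen = set()
    for before, after in zip(states, states[1:]):
        added = after - before  # multiset difference between consecutive states
        if sum(added.values()) == 1:
            (fact,) = added
            if fact.startswith("Fr("):
                if fact in seen:
                    return False
                seen.add(fact)
    return True

def trace_of(labels):
    """Drop empty action sets, keeping only the observable steps (Definition 7)."""
    return [a for a in labels if a]
```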

5. Security properties

In the tamarin tool [30] security properties are described in an expressive two-sorted first-order logic. The sort $temp$ is used for timepoints, and $V_{temp}$ denotes the set of temporal variables.

Definition 8 (Trace formulas).

A trace atom is either false ⊥, a term equality $t_{1} \approx t_{2}$ , a timepoint ordering $i ⋖ j$ , a timepoint equality $i ≐ j$ , or an action $F @ i$ for a fact $F \in F$ and a timepoint i. A trace formula is a first-order formula over trace atoms.

As we will see in our case studies this logic is expressive enough to analyze a variety of security properties, including complex injective correspondence properties.

To define the semantics, let each sort s have a domain $D (s)$ . $D (temp) = Q$ , $D (msg) = M$ , $D (fresh) = FN$ , and $D (pub) = PN$ . A function $θ : V \to M \cup Q$ is a valuation if it respects sorts, i.e., $θ (V_{s}) \subset D (s)$ for all sorts s. If t is a term, $t θ$ is the application of the homomorphic extension of θ to t.

Definition 9 (Satisfaction relation).

The satisfaction relation $(tr, θ) ⊨ φ$ between a trace $tr$, a valuation θ and a trace formula φ is defined as follows: $\begin{array}{lll} (tr, θ) ⊨ ⊥ & & \text{never} \\ (tr, θ) ⊨ F @ i & \text{iff} & θ (i) \in idx (tr) \text{ and } F θ \in_{E} {tr}_{θ (i)} \\ (tr, θ) ⊨ i ⋖ j & \text{iff} & θ (i) < θ (j) \\ (tr, θ) ⊨ i ≐ j & \text{iff} & θ (i) = θ (j) \\ (tr, θ) ⊨ t_{1} \approx t_{2} & \text{iff} & t_{1} θ =_{E} t_{2} θ \\ (tr, θ) ⊨ \neg φ & \text{iff} & \text{not } (tr, θ) ⊨ φ \\ (tr, θ) ⊨ φ_{1} \land φ_{2} & \text{iff} & (tr, θ) ⊨ φ_{1} \text{ and } (tr, θ) ⊨ φ_{2} \\ (tr, θ) ⊨ \exists x : s . φ & \text{iff} & \text{there is } u \in D (s) \text{ such that } (tr, θ [x \mapsto u]) ⊨ φ \end{array}$

For readability, we define $t_{1} ⋗ t_{2}$ as $\neg (t_{1} ⋖ t_{2} \lor t_{1} ≐ t_{2})$ and the remaining comparison operators on timepoints, such as $⩽$, as expected. We also use classical notational shortcuts such as $t_{1} ⋖ t_{2} ⋖ t_{3}$ for $t_{1} ⋖ t_{2} \land t_{2} ⋖ t_{3}$ and $\forall i ⩽ j . φ$ for $\forall i . i ⩽ j \to φ$. When φ is a ground formula we sometimes simply write $tr ⊨ φ$, as the satisfaction of φ is independent of the valuation.

Definition 10 (Validity, satisfiability).

Let $Tr \subseteq {(P (G))}^{*}$ be a set of traces. A trace formula φ is said to be valid for $Tr$ , written $Tr ⊨^{\forall} φ$ , if for any trace $tr \in Tr$ and any valuation θ we have that $(tr, θ) ⊨ φ$ .

A trace formula φ is said to be satisfiable for $Tr$ , written $Tr ⊨^{\exists} φ$ , if there exist a trace $tr \in Tr$ and a valuation θ such that $(tr, θ) ⊨ φ$ .

Note that $Tr ⊨^{\forall} φ$ iff $Tr ⊭^{\exists} \neg φ$ . Given a multiset rewriting system R we say that φ is valid, written $R ⊨^{\forall} φ$ , if ${traces}^{msr} (R) ⊨^{\forall} φ$ . We say that φ is satisfied in R, written $R ⊨^{\exists} φ$ , if ${traces}^{msr} (R) ⊨^{\exists} φ$ . Similarly, given a ground process P we say that φ is valid, written $P ⊨^{\forall} φ$ , if ${traces}^{pi} (P) ⊨^{\forall} φ$ , and that φ is satisfied in P, written $P ⊨^{\exists} φ$ , if ${traces}^{pi} (P) ⊨^{\exists} φ$ .

Example.
The following trace formula expresses secrecy of keys generated on the security API, which we introduced in Section 3. $\begin{matrix} \neg (\exists h, k : msg, i, j : temp . NewKey (h, k) @ i \land K (k) @ j) . \end{matrix}$
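As a small worked instance, the secrecy formula above can be evaluated on a concrete finite trace. The sketch below assumes that facts are tuples of the form `("NewKey", h, k)` and `("K", k)`, that timepoints are list indices, and that terms are compared syntactically rather than modulo E; the function names are hypothetical.

```python
def violates_key_secrecy(trace):
    """True iff some k has NewKey(h, k) at a timepoint i and K(k) at a timepoint j.

    `trace` is a list of sets of facts; a fact is a tuple (name, arg1, ...).
    """
    generated = {f[2] for actions in trace for f in actions if f[0] == "NewKey"}
    known = {f[1] for actions in trace for f in actions if f[0] == "K"}
    return bool(generated & known)

def secrecy_holds(trace):
    # The trace formula is the negation of the existential statement.
    return not violates_key_secrecy(trace)
```

Since the formula places no ordering constraint on i and j, the check only needs the sets of generated and known keys.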

6. A translation from processes into multiset rewrite rules

In this section we define a translation from a process P into a set of multiset rewrite rules $⟦P⟧$ and a translation on trace formulas such that $P ⊧^{\forall} φ$ if and only if $⟦P⟧ ⊧^{\forall} ⟦φ⟧$ . Note that the result also holds for satisfiability, as an immediate consequence. For a rather expressive subset of trace formulas (see [30] for the exact definition of the fragment), checking whether $⟦P⟧ ⊧^{\forall} ⟦φ⟧$ can then be discharged to the tamarin prover that we use as a backend.

6.1. Definition of the translation of processes

Fig. 6.

The set of rules MD.

To model the adversary’s message deduction capabilities, we introduce the set of rules MD defined in Fig. 6. In order for our translation to be correct, we need to make some assumptions on the set of processes we allow. These assumptions are however, as we will see, rather mild and most of them without loss of generality. First we define a set of reserved variables that will be used in our translation and whose use we therefore forbid in the processes.

Definition 11 (Reserved variables and facts).

The set of reserved variables is defined as the set containing the elements $n_{a}$ for any $a \in FN$ and ${lock}_{l}$ for any $l \in N$. The set of reserved facts $F_{res}$ is defined as the set containing facts $f (t_{1}, \dots, t_{n})$ where $t_{1}, \dots, t_{n} \in T$ and $f \in \{Init, Insert, Delete, IsIn, IsNotSet, state, Lock, Unlock, Out, Fr, In, Msg, ProtoNonce, Event, InEvent, {Pred}_{pr}, {Pred\_not}_{pr} ∣ pr \in Σ_{pred}\}$.

For our translation to be sound, we require that for each process, there exists an injective mapping assigning to every unlock t in a process a lock t that precedes it in the process’ syntax tree. Moreover, given a process lock t; P the corresponding unlock in P shall not be under a parallel or replication. These conditions allow us to annotate each corresponding pair lock t, unlock t with a unique label l. The annotated version of a process P is denoted $\overline{P}$ . In case the annotation fails, i.e., P violates one of the above conditions, the process $\overline{P}$ contains ⊥. This is similar to the hypotheses on locks made in StatVerif [3]. They precisely require that:

“In every branch of the syntax tree, every lock must be followed by precisely one corresponding unlock. In $lock; P$, the part of the process P that occurs before the next unlock, if any, may not include parallel, replication, or lock.”

Unlike StatVerif we do not need to forbid nested locks for our results to hold, even though nested locks are not very useful as they directly lead to deadlocks.

Definition 12 (Process annotation).

Given a ground process P we define the annotated ground process $\overline{P}$ as $ap (P, [])$ where: $\begin{array}{l} ap (0, A) := 0 \\ ap (P | Q, A) := \begin{cases} ap (P, A) | ap (Q, A) & \text{if } A = [] \\ ⊥ & \text{otherwise} \end{cases} \\ ap (! P, A) := \begin{cases} ! ap (P, A) & \text{if } A = [] \\ ⊥ & \text{otherwise} \end{cases} \\ ap (\text{if } Pred \text{ then } P \text{ else } Q, A) := \text{if } Pred \text{ then } ap (P, A) \text{ else } ap (Q, A) \\ ap (\text{lookup } M \text{ as } x \text{ in } P \text{ else } Q, A) := \text{lookup } M \text{ as } x \text{ in } ap (P, A) \text{ else } ap (Q, A) \\ ap (α; P, A) := α; ap (P, A) \quad \text{where } α \notin \{\text{lock } t, \text{unlock } t : t \in T\} \\ ap (\text{lock } t; P, A) := {\text{lock}}^{l} t; ap (P, A \cdot (t, l)) \quad \text{where } l \in N \text{ is a fresh label} \\ ap (\text{unlock } t; P, A) := \begin{cases} {\text{unlock}}^{l} t; ap (P, A ∖ \{(t, l)\}) & \text{if } \exists i . A_{i} = (t, l) \text{ and } \forall l^{'}, j < i . A_{j} \neq (t, l^{'}) \text{ for } A = (A_{0}, \dots, A_{m}) \\ ⊥ & \text{otherwise} \end{cases} \end{array}$

Intuitively, the function $ap (P, A)$ traverses the process P and maintains the list A of pending unlocks. A pair $(t, l)$ is in A whenever an instruction lock t was encountered and annotated with the label l, and no corresponding instruction unlock t has been found yet. When encountering an unlock t instruction we annotate it with the first corresponding label that was added to the list. We choose the first occurrence in the list in order to guarantee that the resulting process is uniquely defined. The Appendix of [21] contains a different but equivalent formulation of this definition.
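The traversal of Definition 12 can be sketched in Python as follows. The encoding of processes as nested tuples (`("nil",)`, `("par", P, Q)`, `("repl", P)`, `("if", pred, P, Q)`, `("lookup", M, x, P, Q)`, `("lock", t, P)`, `("unlock", t, P)` and `("act", alpha, P)` for every other action), the use of integers as labels, and the names `annotate`, `contains_bottom` and `BOTTOM` are assumptions of this illustration, not part of the calculus or of the prototype tool.

```python
import itertools

BOTTOM = ("bottom",)  # stands for the failure symbol of the definition

def annotate(process):
    """Return the annotated process; it contains BOTTOM if annotation fails."""
    labels = itertools.count()  # source of fresh labels

    def ap(p, pending):
        kind = p[0]
        if kind == "nil":
            return p
        if kind == "par":
            return ("par", ap(p[1], []), ap(p[2], [])) if not pending else BOTTOM
        if kind == "repl":
            return ("repl", ap(p[1], [])) if not pending else BOTTOM
        if kind == "if":
            return ("if", p[1], ap(p[2], pending), ap(p[3], pending))
        if kind == "lookup":
            return ("lookup", p[1], p[2], ap(p[3], pending), ap(p[4], pending))
        if kind == "lock":
            label = next(labels)
            return ("lock", p[1], label, ap(p[2], pending + [(p[1], label)]))
        if kind == "unlock":
            # Use the first pending entry for this term, as in Definition 12.
            for i, (t, label) in enumerate(pending):
                if t == p[1]:
                    rest = pending[:i] + pending[i + 1:]
                    return ("unlock", p[1], label, ap(p[2], rest))
            return BOTTOM
        # Any other action alpha; P.
        return ("act", p[1], ap(p[2], pending))

    return ap(process, [])

def contains_bottom(p):
    return p == BOTTOM or any(isinstance(q, tuple) and contains_bottom(q) for q in p)
```

For instance, an unlock without a preceding lock, or a parallel composition between a lock and its unlock, yields a process containing `BOTTOM`, which is the situation excluded by well-formedness.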

Definition 13 (Well-formed).

A ground process P is well-formed if

no reserved variable nor reserved fact appears in P,

any bound name and variable in P cannot be rebound, i.e., if u is bound in P then u is not under the scope of a previous binder, and

$\overline{P}$ does not contain ⊥.

A trace formula φ is well-formed if neither a reserved variable nor a reserved fact appears in φ.

The first two restrictions on well-formed processes can be assumed without loss of generality, as processes and formulas can be consistently renamed to avoid reserved variables and α-converted to avoid binding names or variables several times. Also note that the second condition is not necessarily preserved during an execution, e.g., when unfolding a replication, $! P$ and P may bind the same names. We only require this condition to hold on the initial process for our translation to be correct.

The annotation of locks restricts the set of protocols we can translate, but allows us to obtain better verification results, since we can predict which unlock is “supposed” to close a given lock. This additional information is helpful for tamarin’s backward reasoning. We think that our locking mechanism captures all practical use cases. Obviously, locks can be modelled both in tamarin’s multiset rewriting calculus (this is actually what the translation does) and Mödersheim’s set rewriting calculus [26]. However, protocol steps typically consist of a single input, followed by several database lookups, and finally an output. In practice, they tend to be modelled as a single rule, and are therefore atomic. Real implementations are however different, as several entities might be involved, database lookups could be slow, etc. In this case, such simplified models could, e.g., miss race conditions. To the best of our knowledge, StatVerif is the only comparable tool that models locks explicitly and it has stronger restrictions.

Definition 14.
Given a well-formed ground process P we define the labelled multiset rewriting system $⟦P⟧$ as $\begin{matrix} MD \cup \{\text{Init}\} \cup ⟦\overline{P}, [], []⟧, \end{matrix}$ where the rule Init is defined as $\begin{matrix} \text{Init} : [] - [Init ()] \to [{state}_{[]} ()] \end{matrix}$ and $⟦P, p, \tilde{x}⟧$ is defined inductively for process P, position $p \in N^{*}$ and sequence of variables $\tilde{x}$ in Fig. 7. For a position p in P we define ${state}_{p}$ to be persistent if $P |_{p} = ! Q$ for some process Q; otherwise ${state}_{p}$ is linear.

Fig. 7.
Translation of processes: definition of $⟦P, p, \tilde{x}⟧$ .

In the definition of $⟦P, p, \tilde{x}⟧$ we intuitively use the family of facts ${state}_{p}$ to indicate that the process is currently at position p in its syntax tree. A fact ${state}_{p}$ will indeed be true in an execution of these rules whenever some instance of $P_{p}$ (i.e. the process defined by the subtree at position p of the syntax tree of P) is in the multiset $P$ of the process configuration. The translation of the zero-process, parallel and replication operators merely use ${state}_{p}$ -facts. For instance $⟦P ∣ Q, p, \tilde{x}⟧$ defines the rule $\begin{matrix} [{state}_{p} (\tilde{x})] \to [{state}_{p \cdot 1} (\tilde{x}), {state}_{p \cdot 2} (\tilde{x})] \end{matrix}$ which intuitively states that when a process is at position p (modelled by the fact ${state}_{p} (\tilde{x})$ being true) then the process is allowed to move both to P (putting ${state}_{p \cdot 1} (\tilde{x})$ to true) and Q (putting ${state}_{p \cdot 2} (\tilde{x})$ to true). The translation of $⟦P ∣ Q, p, \tilde{x}⟧$ also contains the set of rules $⟦P, p \cdot 1, \tilde{x}⟧ \cup ⟦Q, p \cdot 2, \tilde{x}⟧$ expressing that after this transition the process may behave as P and Q, i.e., the processes at positions $p \cdot 1$ , respectively $p \cdot 2$ , in the process tree. Also note that the translation of $! P$ results in a persistent fact as $! P$ always remains in $P$ . The translation of the construct $ν a$ translates the name a into a variable $n_{a}$ , as msr rules must not contain fresh names. Any instantiation of this rule will substitute $n_{a}$ by a fresh name, which the $Fr$ -fact in the premise guarantees to be new. This step is annotated with a (reserved) action $ProtoNonce$ . This annotation is merely used in the proof of correctness to distinguish adversary and protocol nonces which is useful as it allows us to identify the restricted names of the process. 
Note that the fact ${state}_{p \cdot 1}$ in the conclusion carries $n_{a}$, so that the following protocol steps are bound to the fresh name used to instantiate $n_{a}$. The first rules of the translation of out and in model the communication between the protocol and the adversary, and vice versa. In the case of out, the adversary must know the channel M, modelled by the fact $In (M)$ in the rule’s premise, and learns the output message, modelled by the fact $Out (N)$ in the conclusion. In the case of in, the knowledge of the message N is additionally required and the variables of the input message are added to the parameters of the $state$ fact to reflect that these variables are bound. The second and third rules of the translations of out and in model an internal communication, which is synchronous. For this reason, when the second rule of the translation of out is fired, the $state$-fact is replaced by an intermediate semi-state fact, ${state}^{semi}$, reflecting that the sending process can only execute the next step if the message was successfully received. The fact $Msg (M, N)$ models that a message is present on the synchronous channel. Only with the acknowledgement fact $Ack (M, N)$, resulting from the second rule of the translation of in, is it possible to advance the execution of the sending process, using the third rule in the translation of out, which transforms the semi-state and the acknowledgement of receipt into ${state}_{p \cdot 1} (\dots)$. Only then can the next step of the sending process be executed. The remaining rules essentially update the position in the $state$ facts and add labels. Some of these labels are used to restrict the set of executions. For instance, the label ${Pred}_{pr} (M_{1}, \dots, M_{k})$ will be used to indicate that we only consider executions in which $ϕ_{pr}$ holds for $M_{1}, \dots, M_{k}$. As we will see in the next section, these restrictions will be encoded in the trace formula.

Fig. 8.
The set of multiset rewrite rules $⟦! P_{new}⟧$ (omitting the rules in MD).
Example.
Figure 8 illustrates the above translation by presenting the set of msr rules $⟦! P_{new}⟧$ (omitting the rules in MD already shown in Fig. 6).

A graph representation of an example trace, similar to the one generated by the tamarin tool, is depicted in Fig. 9. Every node stands for the application of a multiset rewrite rule, where the premises are at the top, the conclusions at the bottom, and the actions (if any) annotate the node. Every premise needs to have a matching conclusion, visualized by the arrows, to ensure the graph depicts a valid msr execution. (This is a simplification of the dependency graph representation tamarin uses to perform backward-induction [30,31].) We also note that in the current example $! {state}_{[]} ()$ is persistent and can therefore be used multiple times as a premise. As $Fr ()$ facts are generated by the Fresh rule, which has an empty premise and action, we omit instances of Fresh and leave those premises, but only those, disconnected.

Fig. 9.
Example trace for the translation of $! P_{new}$ .

Remark 1.
One may note that, while for all other operators, the translation produces well-formed multiset rewriting rules (as long as the process is well-formed itself), this is not the case for the translation of the lookup operator, i.e., it violates the well-formedness condition from Definition 4. Tamarin’s constraint solving algorithm requires all rules, with the exception of Fresh, to be well-formed. We show however that, under these specific conditions, the solution procedure is still correct. See Appendix A for the proof.
6.2. Definition of the translation of trace formulas

We can now define the translation for formulas.

Definition 15.
Given a well-formed trace formula φ we define $\begin{matrix} {⟦φ⟧}_{\forall} : = α \Rightarrow φ and {⟦φ⟧}_{\exists} : = α \land φ, \end{matrix}$ where α is defined in Fig. 10.

Fig. 10.
Definition of α.

The formula α uses the actions of the generated rules to filter out executions that we wish to discard:
$α_{init}$ ensures that the init rule is only fired once.

$α_{pred}$ ensures that we only consider traces where for all positive and negative branches in conditionals the corresponding predicate formula, respectively its negation, holds.

$α_{in}$ and $α_{notin}$ ensure that a successful lookup was preceded by an insert that was neither revoked nor overwritten, while for an unsuccessful lookup the key was either never inserted, or deleted and never re-inserted.

$α_{lock}$ checks that between each two matching locks there must be an unlock. Furthermore, between the first of these locks and the corresponding unlock, there is neither a lock nor an unlock.

$α_{inev}$ ensures that whenever an instance of MDIn is required to generate an $In$ -fact, it is generated as late as possible, i.e., there is no visible event between the action $K (t)$ produced by MDIn, and a rule that requires $In (t)$ .
We also note that $Tr ⊨^{\forall} {⟦φ⟧}_{\forall}$ iff $Tr ⊭^{\exists} {⟦\neg φ⟧}_{\exists}$ .
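To illustrate how such a filter acts on traces, the following sketch checks one reading of the prose description of $α_{lock}$ given above; the exact formula is the one in Fig. 10 and is not reproduced here. It assumes that actions are tuples `("Lock", label, term)` and `("Unlock", label, term)`, that "matching" locks are locks on the same term, and that timepoints are list indices. The names `lock_events` and `satisfies_lock_condition` are hypothetical.

```python
def lock_events(trace, name):
    """All (timepoint, label, term) triples for actions called `name`."""
    return [(i, f[1], f[2]) for i, actions in enumerate(trace)
            for f in actions if f[0] == name]

def satisfies_lock_condition(trace):
    locks = lock_events(trace, "Lock")
    unlocks = lock_events(trace, "Unlock")
    for i, label, x in locks:
        for j, _, y in locks:
            if x != y or not i < j:
                continue

            def quiet(k):
                # No lock or unlock on x strictly between i and k.
                return not any(i < m < k and z == x
                               for m, _, z in locks + unlocks)

            # Require Unlock(label, x) at some k with i < k < j.
            if not any(i < k < j and (lk, z) == (label, x) and quiet(k)
                       for k, lk, z in unlocks):
                return False
    return True
```

A trace in which a second lock on the same term occurs before the first one is released is rejected, as is a trace in which the intervening unlock carries a different annotation.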
6.3. Discussion of design choices

There certainly exist other ways of correctly translating our calculus into msr rules. Most of our choices were guided by the way tamarin works internally. To better appreciate our choices we give a high-level overview of the procedure implemented in tamarin. A detailed review of the procedure is however out of the scope of this paper and we refer the reader to [30] for a detailed description.

A short overview of tamarin. Tamarin basically applies a backward reasoning approach to try to find a trace which satisfies a given formula. (Validity claims are first translated to satisfiability claims.) This is reminiscent of the reasoning used when proving protocol correctness in the strand space model [20]. More precisely, rather than reasoning about traces, tamarin reasons about dependency graphs, an enriched representation of traces. Dependency graphs are DAGs, where each node corresponds to a ground instance of an msr rule and the edges represent the causal dependencies among these rules. For every premise of a rule there is an incoming edge from another rule with a conclusion that matches the premise. Moreover, linear facts may have at most one outgoing edge and fresh rules are unique. Every topological ordering then corresponds to a trace.

Tamarin’s backward search is formalised by a constraint solving algorithm. The solutions of a constraint system are the dependency graphs whose traces satisfy the constraints. The initial constraint system is simply the formula to be satisfied. The procedure then applies simplification rules which preserve all solutions. If the constraint system reaches ⊥ the formula is unsatisfiable. In case no more rules can be applied the system is solved, and the dependency graphs that are the solutions of the constraint system can be directly constructed.

Slightly simplifying, a typical rule in the constraint solving algorithm would state that if the formula is of the form $a @ i$ then the dependency graph must contain a node corresponding to a rule $ℓ \overset{b}{\to} r$ with an action b that matches a. Next, it will try to solve each premise in ℓ by adding a constraint that this node must be preceded by a node corresponding to a rule with a fact in its conclusion matching this premise. Another example of a simplification rule is the following, which reasons about the uniqueness of fresh names: when the constraint system contains both $Fr (n) @ i$ and $Fr (n) @ j$ it concludes that $i ≐ j$.

The constraint simplification procedure may of course enter a loop and not terminate. This is natural given that the underlying problem is undecidable. The algorithm can nevertheless be guided by heuristics to avoid some of these loops and use previously proven lemmas and axioms to prune otherwise infinite branches.

Design choices. The axioms in the translation of the formula are designed to work hand in hand with the translation of the process into rules. They express the correctness of traces with respect to our calculus’ semantics, but are also meant to guide tamarin’s constraint solving algorithm. The use of axioms, rather than other possible encodings, often helps the algorithm to enforce termination as they can be used to cut branches that are not consistent with the axioms. We will discuss the axioms related to state manipulation.

Let us first consider the axioms related to lock actions. A naïve axiomatization would postulate that “every lock is preceded by an unlock and no lock or unlock in between, unless it is the first lock.” This would however cause tamarin to loop, as we will see below. We will first illustrate how the axiom $α_{lock}$ avoids this caveat because it only applies to pairs of locks carrying the same annotations.

Consider the constraint solving procedure for the following process

and the trace formula $\forall i, j . Visit () @ i \land Visit () @ j ⟹ i ≐ j$ . The msr rules generated by our translation are depicted in Fig. 11.

Fig. 11.

Translation of process P.

Tamarin shows validity of the trace formula by showing that its negation $\exists i, j . Visit () @ i \land Visit () @ j \land (i ⋖ j \lor j ⋖ i)$ is not satisfiable. Two symmetrical constraint systems need to be refuted; we focus on the one pictured in Fig. 12, i.e., the case where $i ⋖ j$.

Fig. 12.

Constraint system resulting from the negation of $\forall i, j . Visit () @ i \land Visit () @ j ⟹ i ≐ j$ .

Fig. 13.

All $state$ -premises have exactly one matching conclusion and are resolved up to a (unique) instance of Init.

As all $state$ -premises have exactly one rule with a matching conclusion, there are two chains of rule instances from i and j up to the Init rule, which is unique by $α_{init}$ . Both are recovered in this step, see Fig. 13. As tamarin pre-computes chains of rule instantiations whose open premises can be uniquely resolved, this is done in two steps, one for each chain.

Fig. 14.

By $α_{lock}$ , there exists node Unlock( $l_{1}$ ,‘s’) at position $t_{2}$ such that $t_{1} ⋖ t_{2} ⋖ t_{3}$ without any matching lock or unlock for ‘s’ between $t_{1}$ and $t_{2}$ .

Now $α_{lock}$ is applied, which adds the constraint that the first lock needs to have a matching unlock, i.e., a node Unlock( $l_{1}$ ,‘s’) has to appear at some position $t_{2}$ between positions $t_{1}$ and $t_{3}$ as sketched in Fig. 14. More precisely, we require the existence of an unlock for $‘s’$ annotated with $l_{1}$ , and no lock or unlock for ‘s’ in between. The axiom itself contains only one case, so the only case distinction that takes place is over which rule produces the matching $Unlock$ -action. Due to the annotation, however, all but one are refuted immediately in the next step, as two nodes containing the same fact $Fr (l_{1})$ in the premise are unified immediately.

Fig. 15.

$State$ -premise at position $t_{2}$ can be resolved up to Init. Same fresh value $l_{1}$ is generated at positions $t_{1}$ and $t_{2}$ .

Due to the annotation, the fact ${state}_{11211} (l_{1})$ contains the same fresh name $l_{1}$ that instantiates the annotation variable in Unlock( $l_{1}$ ,‘s’) at $t_{1}$ . Every fact ${state}_{p^{'}} (\dots)$ for some position $p^{'}$ that is a prefix of p and a suffix of the position of the corresponding lock contains this fresh name. Furthermore, every rule instantiation that is an ancestor of a node in the dependency graph corresponds to the execution of a command that is an ancestor in the process tree. Therefore, the backward search eventually reaches the matching lock, including the annotation, which is determined to be $l_{1}$ , and hence appears in the $Fr$ -premise (Fig. 15).

Because of the common premise $Fr (l_{1})$ , both subgraphs are merged. The result is a sequence of nodes from the first lock to the corresponding unlock, and graph constraints restricting the second lock to not take place between the first lock and the unlock. We note that the axiom $α_{lock}$ is only instantiated once per pair of locks, since it requires that $i ⋖ j$ , thereby fixing their order.

If we did not annotate locks with fresh names, these two subgraphs would not be merged, as they could be different. In fact, the axiom $α_{lock}$ would apply again, e.g., for Lock( $l_{1}$ ,‘s’) (or rather Lock(‘s’)) at $t_{1}$ and the newly created rule instantiation with the same action. We would thus run into a loop.

Fig. 16.

Because of the identical premise $Fr (l_{1})$ in both chains leading to $t_{1}$ and $t_{2}$, and as all $state$ -facts below position $[1]$ are linear, both subgraphs are merged.

We have achieved a total ordering on all rule instantiations that appear in the constraint system. Now $α_{notin}$ can be applied for the rule instantiation at k as pictured in Fig. 16. Since $t_{2} ⋖ t_{3}$, it holds that $i^{'} ⋖ k$ and thus the first case can be refuted. The second case is also refuted right away, as there is no rule with action Delete in the translation of P.

In contrast, consider now the naïve formulation of $α_{lock}$ (“every lock is preceded by an unlock and no lock or unlock in between, unless it is the first lock”): $\begin{array}{ll} α_{lock}^{'} = \forall t_{1}, l, s . Lock (l, s) @ t_{1} ⟹ & (\exists t_{0}, l^{'} . Unlock (l^{'}, s) @ t_{0} \land t_{0} ⋖ t_{1} \\ & \quad \land (\forall t_{i}, l_{i} . Lock (l_{i}, s) @ t_{i} \Rightarrow (t_{i} ⋖ t_{0}) \lor (t_{1} ⋖ t_{i})) \\ & \quad \land (\forall t_{i}, l_{i} . Unlock (l_{i}, s) @ t_{i} \Rightarrow (t_{i} ⋖ t_{0}) \lor (t_{1} ⋖ t_{i}))) \\ & \lor (\forall t_{i}, l_{i} . Lock (l_{i}, s) @ t_{i} \Rightarrow t_{0} ⋖ t_{i}) \end{array}$

Even if annotations are employed, this would easily provoke a loop: applied after the second step to the Lock-node at $t_{3}$ (see Fig. 13), the first case would require a node Unlock( $l^{'}$ ,‘s’) at position $t_{0}$ with $t_{0} ⋖ t_{3}$. Similar to the second step, a chain of rule instances from this node to the unique instantiation of the Init rule would be created in one step, as pictured in Fig. 17. Observe that the rule instantiation at position $t_{0}^{'}$ has an action Lock( $l^{'}$ ,‘s’). As $l^{'}$ is not necessarily equal to $l_{1}$ or $l_{2}$, this chain of rule instantiations cannot be merged with any other subgraph. Hence the Lock-action at position $t_{0}^{'}$ is considered to be new, and thus $α_{lock}^{'}$ can be applied again, resulting in a loop. This loop is triggered whenever an action Lock(l,‘s’) appears.

Fig. 17.

The naïve formulation $α_{lock}^{'}$ provokes a loop: $t_{1}$ and $t_{0}^{'}$ are possibly distinct, thus $α_{lock}^{'}$ applies to Lock( $l'$ ,‘s’) at $t_{0}^{'}$ .

In summary, a careful formulation of this axiom was necessary to avoid loops. The annotation helps distinguish which unlock is expected between two locks, vastly improving the speed of the backward search. This optimisation, however, required us to put restrictions on the locks. The axiom is formulated in a way that links the lock with the corresponding unlock by means of this annotation. The equivalence between $α_{lock}$ and the naïve formulation is non-trivial, but shown in the proof of Lemma 10 in Appendix B.

Similarly, the axioms $α_{in}$ and $α_{notin}$ are designed to work well with tamarin’s constraint solving algorithm: when a constraint with the action IsIn is created, by definition of the translation, this corresponds to a lookup command. The existential in $α_{in}$ translates into a graph constraint that postulates the existence of an insert node for the value fetched by the lookup, and three formulas ensuring that (i) this insert node appears before the lookup, (ii) it is uniquely defined, i.e., it is the last insert to the corresponding key, and (iii) there is no delete in between. Due to these conditions, $α_{in}$ only adds one Insert node per IsIn node. The case where an axiom postulates a node, which itself allows for postulating yet another node, needs to be avoided, as tamarin runs into loops otherwise. The technique of enforcing correctness of the translation through rewriting the formula via these axioms additionally allows us to convey information on the nature of the rules resulting from the translation to the constraint solving algorithm.

6.4. Correctness of the translation

The correctness of our translation is stated by the following theorem.

Theorem 1.
Given a well-formed ground process P and a well-formed trace formula φ we have that $\begin{matrix} {traces}^{pi} (P) ⊨^{⋆} φ iff {traces}^{msr} (⟦P⟧) ⊨^{⋆} {⟦φ⟧}_{⋆} \end{matrix}$ where ⋆ is either ∀ or ∃.

We give here an overview of the main propositions and lemmas needed to prove Theorem 1. To show the result we need two additional definitions. We first define an operation that allows us to restrict a set of traces to those that satisfy the trace formula α as defined in Definition 15.
Definition 16.
Let α be the trace formula as defined in Definition 15 and $Tr$ a set of traces. We define $\begin{matrix} filter (Tr) : = {tr \in Tr ∣ \forall θ . (tr, θ) ⊨ α} . \end{matrix}$

The following proposition states that a set of traces satisfies the translated formula if and only if the filtered traces satisfy the original formula.
Proposition 1.
Let $Tr$ be a set of traces and φ a trace formula. We have that $\begin{matrix} Tr ⊨^{⋆} {⟦φ⟧}_{⋆} iff filter (Tr) ⊨^{⋆} φ, \end{matrix}$ where ⋆ is either ∀ or ∃.
Proof.
We first show the two directions for the case $⋆ = \forall$. We start by showing that $Tr ⊨^{\forall} {⟦φ⟧}_{\forall}$ implies $filter (Tr) ⊨^{\forall} φ$. $\begin{array}{lll} Tr ⊨^{\forall} {⟦φ⟧}_{\forall} & \Rightarrow filter (Tr) ⊨^{\forall} {⟦φ⟧}_{\forall} & (\text{since } filter (Tr) \subseteq Tr) \\ & \Leftrightarrow filter (Tr) ⊨^{\forall} α \Rightarrow φ & (\text{by definition of } {⟦φ⟧}_{\forall}) \\ & \Leftrightarrow filter (Tr) ⊨^{\forall} φ & (\text{since } filter (Tr) ⊨^{\forall} α) \end{array}$ We next show that $filter (Tr) ⊨^{\forall} φ$ implies $Tr ⊨^{\forall} {⟦φ⟧}_{\forall}$. $\begin{array}{lll} filter (Tr) ⊨^{\forall} φ & \Rightarrow filter (Tr) ⊨^{\forall} α \land φ & (\text{since } filter (Tr) ⊨^{\forall} α) \\ & \Leftrightarrow Tr ⊨^{\forall} \neg α \lor (α \land φ) & (\text{since } filter (Tr) \subseteq Tr \text{ and } (Tr ∖ filter (Tr)) ⊭^{\forall} α) \\ & \Leftrightarrow Tr ⊨^{\forall} α \Rightarrow φ & \\ & \Leftrightarrow Tr ⊨^{\forall} {⟦φ⟧}_{\forall} & (\text{by definition of } {⟦φ⟧}_{\forall}) \end{array}$ The case of $⋆ = \exists$ now easily follows: $\begin{matrix} Tr ⊨^{\exists} {⟦φ⟧}_{\exists} \text{ iff } Tr ⊭^{\forall} {⟦\neg φ⟧}_{\forall} \text{ iff } filter (Tr) ⊭^{\forall} \neg φ \text{ iff } filter (Tr) ⊨^{\exists} φ . \end{matrix}$ □

Next we define the hiding operation which removes all reserved facts from a trace.
Definition 17 ( $hide$ ).

Given a trace $tr$ and a set of facts F we inductively define $hide ([]) = []$ and $\begin{matrix} hide (F \cdot tr) : = \{\begin{matrix} hide (tr) & if F \subseteq F_{res}, \\ (F ∖ F_{res}) \cdot hide (tr) & otherwise . \end{matrix} \end{matrix}$ Given a set of traces $Tr$ we define $hide (Tr) = {hide (t) ∣ t \in Tr}$ .
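The hiding operation of Definition 17 can be sketched as follows. Facts are assumed to be tuples whose first component is the fact symbol, and `RESERVED` lists only an illustrative subset of the reserved symbols of Definition 11; both choices, as well as the helper name `is_reserved`, are assumptions of this example.

```python
# Illustrative subset of the reserved fact symbols (see Definition 11).
RESERVED = {"Init", "Insert", "Delete", "IsIn", "IsNotSet",
            "Lock", "Unlock", "ProtoNonce", "Event", "InEvent"}

def is_reserved(fact):
    return fact[0] in RESERVED

def hide(trace):
    """Remove reserved facts; drop steps that consist only of reserved facts."""
    result = []
    for actions in trace:
        if all(is_reserved(f) for f in actions):
            continue  # case where the whole action set is reserved
        result.append(frozenset(f for f in actions if not is_reserved(f)))
    return result
```

Steps whose actions are all reserved disappear entirely, so timepoints of the remaining actions shift but their relative order is preserved.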

As expected well-formed formulas that do not contain reserved facts evaluate the same whether reserved facts are hidden or not.

Proposition 2.
Let $Tr$ be a set of traces and φ a well-formed trace formula. We have that $\begin{matrix} Tr ⊨^{⋆} φ iff hide (Tr) ⊨^{⋆} φ, \end{matrix}$ where ⋆ is either ∀ or ∃.
Proof.
We start with the case $⋆ = \exists$ and show the stronger statement that for a trace $tr$ $\forall θ . \exists θ^{'} . \text{ if } (tr, θ) ⊨ φ \text{ then } (hide (tr), θ^{'}) ⊨ φ$ and $\forall θ . \exists θ^{'} . \text{ if } (hide (tr), θ) ⊨ φ \text{ then } (tr, θ^{'}) ⊨ φ .$ We will show both statements by a nested induction on $| tr |$ and the structure of the formula. (The underlying well-founded order is the lexicographic ordering of the pairs consisting of the length of the trace and the size of the formula.)

If $| tr | = 0$ then $tr = []$ and $tr = hide (tr)$ which allows us to directly conclude letting $θ^{'} : = θ$ .

If $| tr | = n$, we define $\overline{tr}$ and F such that $tr = \overline{tr} \cdot F$. By induction hypothesis we have that $\forall \overline{θ} . \exists {\overline{θ}}^{'} . \text{ if } (\overline{tr}, \overline{θ}) ⊨ φ \text{ then } (hide (\overline{tr}), {\overline{θ}}^{'}) ⊨ φ$ and $\forall \overline{θ} . \exists {\overline{θ}}^{'} . \text{ if } (hide (\overline{tr}), \overline{θ}) ⊨ φ \text{ then } (\overline{tr}, {\overline{θ}}^{'}) ⊨ φ .$ We proceed by structural induction on φ.
$φ = ⊥$, $φ = i ⋖ j$, $φ = i ≐ j$ or $φ = t_{1} ≐ t_{2}$. In these cases we trivially conclude as the truth value of these formulas does not depend on the trace and for both statements we simply let $θ^{'} := θ$.

$φ = f @ i$. We start with the first statement. Suppose that $(tr, θ) ⊨ f @ i$. If $θ (i) < n$ then we also have that $(\overline{tr}, θ) ⊨ f @ i$. By induction hypothesis, there exists ${\overline{θ}}^{'}$ such that $(hide (\overline{tr}), {\overline{θ}}^{'}) ⊨ f @ i$. Hence we also have that $(hide (tr), {\overline{θ}}^{'}) ⊨ f @ i$ and letting $θ^{'} := {\overline{θ}}^{'}$ allows us to conclude. If $θ (i) = n$ we know that $f \in tr_{n}$. As φ is well-formed, $f \notin F_{res}$ and hence $f \in hide {(tr)}_{n^{'}}$ where $n^{'} = | hide (tr) |$. The proof of the other statement is similar.

$φ = \neg φ^{'}$ , $φ = φ_{1} \land φ_{2}$ , or $φ = \exists x : s . φ^{'}$ . We directly conclude by induction hypotheses (on the structure of φ).
From the above statements we easily have that $Tr ⊨^{\exists} φ$ iff $hide (Tr) ⊨^{\exists} φ$. The case of $⋆ = \forall$ now easily follows: $Tr ⊨^{\forall} φ \text{ iff } Tr ⊭^{\exists} \neg φ \text{ iff } hide (Tr) ⊭^{\exists} \neg φ \text{ iff } hide (Tr) ⊨^{\forall} φ .$ □

We can now state our main lemma, which relates the set of traces of a process P to the set of traces of its translation into multiset rewrite rules.
Lemma 1.
Let P be a well-formed ground process. We have that $\begin{matrix} {traces}^{pi} (P) = hide (filter ({traces}^{msr} (⟦P⟧))) . \end{matrix}$

The proof is given in Appendix B. Our main theorem can now be proven by applying Lemma 1, Proposition 2 and Proposition 1.

Proof of Theorem 1.
$\begin{array}{rll} {traces}^{pi} (P) ⊨^{⋆} φ & \Leftrightarrow hide (filter ({traces}^{msr} (⟦P⟧))) ⊨^{⋆} φ & \text{(by Lemma 1)} \\ & \Leftrightarrow filter ({traces}^{msr} (⟦P⟧)) ⊨^{⋆} φ & \text{(by Proposition 2)} \\ & \Leftrightarrow {traces}^{msr} (⟦P⟧) ⊨^{⋆} {⟦φ⟧}_{⋆} & \text{(by Proposition 1)} \end{array}$ □

7. Case studies and dedicated heuristics

In the following we give a brief overview of some case studies we performed. These case studies include a simple security API similar to PKCS#11 [27], the Yubikey security token, the optimistic contract signing protocol by Garay, Jakobsson and MacKenzie (GJM) [16] and a few other examples discussed in Arapinis et al. [3] and Mödersheim [26]. We do not detail all the formal models of the protocols and properties that we studied, and sometimes present slightly simplified versions. All files of our prototype implementation and our case studies are available at http://sapic.gforge.inria.fr/

In addition to the syntax of the calculus described in Section 3, our tool also allows the user to fall back to labelled msr rules inside of processes. The treatment of this extension is described in the conference version [21]. Having access to the underlying formalism may sometimes be convenient, but as we do not use it in the examples described in this paper, we chose to omit this feature to clarify the presentation.

Related work complements these case studies with an analysis of a more complete model of PKCS#11 [22], and the enhanced authorisation mechanism in the TPM 2.0 [33], as well as an extension of SAPIC that allows for the analysis of stream protocols such as TESLA [19].

We will also discuss dedicated heuristics we developed that favour termination of tamarin on msr systems produced by our tool. The results are summarized in Table 1. For each case study we provide the number of typing lemmas that were needed by the tamarin prover and whether manual guidance of the tool was required. In case no manual guidance is required we also give execution times.

Table 1
Case studies

Example	Typing lemmas	Automated run (w/o heuristics) *	Automated run (w/ heuristics) *
Security API à la PKCS#11	4	no	yes (2.1 s)
Needham–Schroeder–Lowe [25]	1	yes (1.4 s)	yes (17.7 s)
Yubikey protocol [23,34]	5	no	no
GJM protocol [3,16]	0	yes (11.5 s)	yes (9.9 s)
Mödersheim’s example [26]	0	no	yes (0.8 s)
Security device [3]	1	yes (3.5 s)	yes (8.7 s)

* Running times measured on an Intel i7-4770 CPU at 3.40 GHz (8 cores) with 8 GB RAM.

7.1. Security API à la PKCS#11

This example illustrates how our modelling might be useful for the analysis of Security APIs in the style of the PKCS#11 standard [27]. Indeed, Künnemann [22] used our tool to perform an automated analysis of PKCS#11 v2.20. In addition to the processes presented in the running example in Section 3 the actual case study models the following two operations: (i) encryption: given a handle and a plain-text, the user can request an encryption under the key the handle points to. (ii) unwrap: given a ciphertext $senc (k_{2}, k_{1})$ , and a handle $h_{1}$ , the user can request the ciphertext to be unwrapped, i.e. decrypted, under the key pointed to by $h_{1}$ . If decryption is successful, the result is stored on the device, and a handle pointing to $k_{2}$ is returned. Moreover, contrary to the running example, at creation time keys are assigned the attribute ‘init’, from which they can move to either ‘wrap’, or ‘unwrap’. Furthermore, the database maps handles to pairs of keys and attributes. See the following snippet:

Note that, in contrast to the running example, it is necessary to encapsulate the state changes between lock and unlock. Otherwise an adversary can stop the execution after line 3, set the attribute to ‘wrap’ in a concurrent process and produce a wrapping. After resuming operation at line 4, he can set the key’s attribute to ‘dec’, even though the attribute is by then set to ‘wrap’. Hence, the attacker is allowed to decrypt the wrapping he has produced and can obtain the key. Such subtleties can produce attacks that our modelling makes it possible to detect. If locking is handled correctly, we show secrecy of keys produced on the device, proving the property introduced in Example 5. If locks are removed, the attack described before is found. The conference version [21] mistakenly reported that the verification of this example was fully automated, but the verified model contained a typo, where $P_{set\_wrap}$ wrote to $⟨ attr, h ⟩$ rather than $⟨ att, h ⟩$, effectively disabling unwrapping altogether. Using the new heuristics, it is again possible to verify this example automatically.
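The interleaving behind this attack can be replayed deterministically. The following Python sketch is only an illustration under stated assumptions: it is not the process from the case study, the function names `set_attr` and `interleave` are hypothetical, and a generator stands in for a process that the adversary may suspend between the lookup and the insert.

```python
import threading

def set_attr(attrs, locks, h, new_attr, use_lock):
    """One attribute-changing process, written as a generator so that a
    scheduler (the adversary) can suspend it between lookup and insert."""
    if use_lock:
        while not locks[h].acquire(blocking=False):
            yield "blocked"            # lock held elsewhere: cannot proceed
    current = attrs[h]                 # lookup
    yield "looked up"                  # possible suspension point
    if current == "init":
        attrs[h] = new_attr            # insert
    if use_lock:
        locks[h].release()

def interleave(use_lock):
    """Suspend the 'dec' process after its lookup, let the 'wrap' process
    run, then resume. Returns the attribute observed at both moments."""
    attrs = {"h": "init"}
    locks = {"h": threading.Lock()}
    p_dec = set_attr(attrs, locks, "h", "dec", use_lock)
    p_wrap = set_attr(attrs, locks, "h", "wrap", use_lock)
    observed = []
    next(p_dec)                        # p_dec looks up 'init', then is suspended
    for status in p_wrap:              # adversary runs p_wrap in the meantime
        if status == "blocked":
            break
    observed.append(attrs["h"])
    for _ in p_dec:                    # p_dec resumes
        pass
    observed.append(attrs["h"])
    return observed
```

Without the lock, `interleave(False)` observes the attribute ‘wrap’ and later ‘dec’ for the same handle, which is the situation exploited by the attacker. With the lock, `interleave(True)` never observes ‘wrap’, because the concurrent process cannot proceed while the first one holds the lock.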

7.2. Yubikey

The Yubikey [34] is a small hardware device designed to authenticate a user against network-based services. Manufactured by Yubico, a Swedish company, the Yubikey itself is a low cost ($25), thumb-sized USB device. In its typical configuration, it generates one-time passwords based on encryptions of a secret value, a running counter and some random values using a unique AES-128 key contained in the device. The Yubikey authentication server accepts a one-time password only if it decrypts under the correct AES key to a valid secret value containing a counter larger than the last counter accepted. The counter is thus a means to prevent replay attacks. To date, over a million Yubikeys have been shipped to more than 50,000 customers including governments, universities and enterprises, e.g. Google, Microsoft and Facebook [36].

The following process $P_{Yubikey}$ models a single Yubikey, as well as its initial configuration, where an entry in the server’s database for the public id $pid$ is created. This entry contains a tuple consisting of the Yubikey’s secret id, AES key, and an initial counter value.

Here, the processes $! P_{Plugin}$ and $! P_{ButtonPress}$ model the Yubikey being unplugged and plugged in again (possibly on a different computer), and the emission of the one-time password. We will only discuss $P_{ButtonPress}$ here. When the user presses the button on the Yubikey, the device outputs a one-time password consisting of a counter $tc$, the secret id $secretid$ and additional randomness $npr$ encrypted using the AES key k. For readability, we leave out events that are only used in helping lemmas as well as message input from the adversary that is included in the model to force him to provide the next counter (which he always can, as it is public).

The one-time password $senc (⟨ secretid, tc, npr ⟩, k)$ can be used to authenticate against a server that shares the same secret key, which we model in the process $P_{Server}$. The process receives the encrypted one-time password along with the public id $pid$ of a Yubikey and a $nonce$ that is part of the protocol, but is irrelevant for the authentication of the Yubikey on the server. The server then looks up the secret id and the AES key associated to the public id, as well as the last recorded counter value $otc$. If the key and secret id used in the request match the values retrieved from the database, then the event $Login (pid, k, tc)$ is logged, marking a successful login of the Yubikey $pid$ with key k for the counter value $tc$. Afterwards, the old tuple $⟨ secretid, k, otc ⟩$ is replaced by $⟨ secretid, k, tc ⟩$, to update the latest counter value received.
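The server behaviour described above can be sketched in Python. This is an illustration only, not the process $P_{Server}$ itself: encryption is a symbolic tagged tuple, counters are plain integers rather than multisets, and the class `Server` with its methods `register` and `handle` are hypothetical names. The sketch also includes the check that the received counter is larger than the last accepted one, as stated in the description of the device.

```python
import threading
from collections import defaultdict

def senc(payload, key):
    """Symbolic encryption: a tagged tuple that is only opened by pattern matching."""
    return ("senc", payload, key)

class Server:
    def __init__(self):
        self.db = {}                               # pid -> (secretid, k, otc)
        self.locks = defaultdict(threading.Lock)   # one lock per public id
        self.log = []                              # logged Login events

    def register(self, pid, secretid, k, counter=0):
        self.db[pid] = (secretid, k, counter)

    def handle(self, pid, nonce, otp):
        """Process one request; the nonce plays no role in the decision."""
        with self.locks[pid]:                      # requests for other pids are not blocked
            if pid not in self.db:
                return False
            secretid, k, otc = self.db[pid]        # lookup
            tag, payload, key = otp                # pattern matching instead of decryption
            if tag != "senc" or key != k:
                return False
            sid, tc, _npr = payload
            if sid != secretid or not otc < tc:    # wrong secret id, or stale counter
                return False
            self.log.append(("Login", pid, k, tc))
            self.db[pid] = (secretid, k, tc)       # replace the old tuple
            return True
```

Submitting the same one-time password twice succeeds only the first time, since the second request carries a counter that is no longer larger than the recorded one.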

Note that, in our modelling, the server keeps one lock per public id, which means that it is possible to have several active instances of the server thread in parallel as long as all requests concern different Yubikeys.

We model the counter as a multiset only consisting of the symbols “one” and “zero”. The multiplicity of ‘one’ in the multiset is the value of the counter. A counter value is considered smaller than another one, if the first multiset is included in the second, therefore $\begin{matrix} ϕ_{smaller} (x_{1}, x_{2}) : = \exists z . x_{1} + z = x_{2} . \end{matrix}$
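The relation $ϕ_{smaller}$ can be mimicked with Python's `collections.Counter` as a multiset. This is a sketch under one assumption that is not stated above: the witness $z$ is taken to be non-empty, so that the relation is strict. The helper names `counter` and `smaller` are hypothetical.

```python
from collections import Counter

def counter(n):
    """Multiset containing n copies of the symbol 'one'."""
    return Counter(["one"] * n)

def smaller(x1, x2):
    """Is there a non-empty multiset z with x1 + z == x2?"""
    z = x2 - x1                       # Counter subtraction keeps only positive counts
    return sum(z.values()) > 0 and x1 + z == x2
```

With these definitions `smaller(counter(1), counter(3))` holds, while `smaller(counter(2), counter(2))` and `smaller(counter(3), counter(1))` do not.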

The process we analyse models a single authentication server (that may run arbitrarily many threads) and an arbitrary number of Yubikeys, i.e., $P_{Server} ∣ ! P_{Yubikey}$. Among other properties, we show by means of an injective correspondence property that an attacker that controls the network cannot perform replay attacks, and that each successful login was preceded by a user “pressing the button”, formally: $\begin{array}{l} \forall pid, k, x, t_{2} . Login (pid, k, x) @ t_{2} \Rightarrow \\ \quad \exists sid, t_{1} . YubiPress (pid, sid, k, x) @ t_{1} \land t_{1} ⋖ t_{2} \\ \quad \land \forall t_{3} . Login (pid, k, x) @ t_{3} \Rightarrow t_{3} ≐ t_{2} . \end{array}$ Besides injective correspondence, we show the absence of replay attacks and the property that a successful login invalidates previously emitted one-time passwords. All three properties follow more or less directly from a stronger invariant, which itself can be proven in 516 steps. To find these steps, tamarin needs some additional human guidance (17 steps), which can be provided using the interactive mode. This mode still allows the user to complement manual effort with automated backward search. The example files contain the modelling in our calculus, the complete proof, and the manual part of the proof which can be verified by tamarin without interaction.

Our analysis makes three simplifications: First, in $P_{Server}$, we use pattern matching instead of decryption as demonstrated in the process $P_{dec}$ we introduced in Section 3. Second, we omit the CRC checksum and the time-stamp that are part of the one-time password in the actual protocol, since they do not add to the security of the protocol in the symbolic setting. Third, the Yubikey actually has two counters instead of one: a session counter and a token counter. We treat the session and token counter on the Yubikey as a single value, which we justify by the fact that the Yubikey either increases the session counter and resets the token counter, or increases only the token counter, thereby implementing a complete lexicographical order on the pair $(\text{session counter}, \text{token counter})$.
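The justification for merging the two counters can be checked on a small example. In the Python sketch below, tuples are compared lexicographically, the function names `next_token` and `new_session` are hypothetical, and resetting the token counter to 0 is an assumption made for illustration.

```python
def next_token(counters):
    """Increase only the token counter."""
    session, token = counters
    return (session, token + 1)

def new_session(counters):
    """Increase the session counter and reset the token counter (to 0 here)."""
    session, token = counters
    return (session + 1, 0)

# Python compares tuples lexicographically, so both updates yield a
# strictly larger pair, whatever the current token counter is.
start = (3, 7)
after_token = next_token(start)      # (3, 8)
after_session = new_session(start)   # (4, 0)
```

Both `after_token > start` and `after_session > start` hold, so a single strictly increasing value is a faithful abstraction of the pair.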

A similar analysis has already been performed by Künnemann and Steel, using tamarin’s multiset rewriting calculus [23]. However, the model in our new calculus is more fine-grained and we believe more readable. Security-relevant operations like locking and tests on state are written out in detail, resulting in a model that is closer to the real-life operation of such a device. The modeling of the Yubikey takes approximately 38 lines in our calculus, which translates to 49 multiset rewrite rules. The model of [23] contains only four rules, but they are quite complicated, resulting in 23 lines of code. More importantly, the gap between their model and the actual Yubikey protocol is larger – in our calculus, it becomes clear that the server can treat multiple authentication requests in parallel, as long as they do not claim to stem from the same Yubikey. An implementation on the basis of the model from Künnemann and Steel would need to implement a global lock accessible to the authentication server and all Yubikeys. This is however unrealistic, since the Yubikeys may be used at different places around the world, making it unlikely that there exist means of direct communication between them. While a server-side global lock might be conceivable (albeit impractical for performance reasons), a real global lock could not be implemented for the Yubikey as deployed.

7.3. The GJM contract signing protocol [16]

A contract signing protocol allows two parties to sign a contract in a fair way: none of the participants should be bound to the contract without the other participant being bound as well. A straightforward solution is to use a trusted party that collects both signatures on the contract and then sends the signed contracts to each of the participants. Optimistic protocols have been designed to avoid the use of a trusted party whenever possible (optimizing efficiency, and avoiding the potential cost of a trusted party). In these protocols the parties first try to simply exchange the signed contracts; in case of failure, or cheating behavior of one of the parties, the trusted party can be contacted. Depending on the situation, the trusted party may either abort the contract, or resolve it. In case of an abort decision the protocol ensures that none of the parties obtains a signed contract, while in case of a resolve the protocol ensures that both participants obtain the signed contract. For this the trusted party needs to maintain a database with the current status of all contracts (aborted, resolved, or no decision has been taken). In our calculus the status information is naturally modelled using our insert and lookup constructs. The use of locks is also crucial here to prevent the status from being changed between a lookup and an insert.
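The role of the status database can be sketched as follows. This Python fragment is a strong simplification and not the GJM message flow: it only shows a lookup followed by an insert under a lock, so that a contract receives at most one decision. The class `TrustedParty` and its methods are hypothetical, and a single lock is used where one lock per contract would also be possible.

```python
import threading

class TrustedParty:
    def __init__(self):
        self.status = {}                  # contract -> "aborted" | "resolved"
        self.lock = threading.Lock()

    def _decide(self, contract, decision):
        """Record `decision` unless a decision already exists; return the one in force."""
        with self.lock:                   # no change between lookup and insert
            current = self.status.get(contract)    # lookup
            if current is None:
                self.status[contract] = decision   # insert
                return decision
            return current

    def abort(self, contract):
        return self._decide(contract, "aborted")

    def resolve(self, contract):
        return self._decide(contract, "resolved")
```

A resolve request that arrives after an abort for the same contract is answered with the earlier decision, and vice versa, so no contract is ever both aborted and resolved in this sketch.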

This protocol was also studied by Arapinis et al. [3]. They showed the crucial property that the same contract can never be both aborted and resolved. However, because StatVerif only supports a finite number of memory cells, they showed this property for a single contract and provided a manual proof to lift the result to an unbounded number of contracts. We directly prove this property for an unbounded number of contracts.

7.4. Further case studies

We investigated the case study presented by Mödersheim [26], a key-server example, as well as a simple security device which served as an example for StatVerif [3]: the device is initialized once, either to left or right. Later on, it accepts pairs of encryptions and decrypts either the left component of the pair or the right component, but not both. As the input language of StatVerif is very similar to ours, their model could be easily adapted to our tool. In fact, we were able to remove the restriction to a single security device. Finally, we also illustrate the tool’s ability to analyze classical security protocols by analyzing the Needham–Schroeder–Lowe protocol [25].

7.5. Heuristics

In order to improve our results on the case studies presented in the conference version [21], we have altered the heuristics of the tamarin-prover. We make use of the a priori knowledge that the msr system is an output of our translation. These heuristics can be switched on using the command line switch --heuristic=p and alter the ranking of goals which is used to determine the next step in an automatic proof. The heuristics have no bearing on the correctness of tamarin, but often improve automation of the verification procedure, as our case studies show (see Table 1). These heuristics also allowed an automated proof of the PKCS#11 case study [22].

The main goal is to avoid a loop in the resolution procedure, so our approach is conservative in that we only prioritize goals that do not cause other prioritized goals to appear, unless the protocol has been annotated to do that. The heuristic alters tamarin’s standard “smart” heuristic in the following ways. First, state-facts are resolved right away. As state-goals can only be solved by exactly one rule (except for message transmission), and state predicates in the premise of a rule are indexed with a position that is a prefix of the position of the state predicate in the conclusion, loops are impossible and case distinctions rare. Moreover, tamarin precomputes chains and is hence often able to resolve the chain until ${state}_{0}$ in one step. Second, goals for $Unlock$-actions are solved right away. As these goals are produced by $α_{lock}$, they identify the correct unlock using the annotations introduced in Definition 12. By reformulating $α_{lock}$ (compared to the conference article), we were able to avoid the repeated application of this rule. Third, we removed the prioritisation of goals for adversarial deduction of fresh values, as it is counter-productive in the case of handles. They are fresh values that can usually be derived from protocol output, so a case distinction on all possible ways of deriving them is sometimes misleading. Finally, another addition prioritizes goals for $Insert$-actions when the first element of the key is prefixed “F_”, so the user can prioritize the reasoning on lookups to keys like $⟨ \text{'F\_database'}, p ⟩$. Adversarial deduction for fresh values can be prioritized in the same way; using “L_” instead of “F_” achieves deprioritisation.

7.6. Proof effort

A comparison between the effort needed to derive a proof for a protocol in our calculus and a protocol modelled via multiset rewrite rules is only sound when both model the same thing. Whenever the direct encoding is simplified, e.g., in the Yubikey model, the proof is obviously simpler, but on the other hand, as we have already discussed in Section 7.2, it may be oversimplified. Whenever models were relatively close, our experiments suggested that the same kind of lemmas are needed. In particular for the GJM contract signing protocol, the simple security device and the Needham–Schroeder–Lowe protocol, the lemmas were literally the same. This suggests that these helping lemmas prove properties beyond the level of representation, i.e., properties of the protocol itself.

Our dedicated heuristics discussed in the previous section also improve termination. One may note that tamarin also includes several heuristics that can be chosen from and combined in several ways to help termination. Some of the case studies, e.g., the group protocols analysed in [32], also required the development of dedicated heuristics. Our heuristics benefit from the fact that the msr rules are generated and, therefore, are more restricted than the arbitrary msr rules that may be given to tamarin using a direct msr rule modelling.

When these heuristics fail, or the user wishes to inspect the proof, tamarin’s interactive mode allows manual inspection and selection of the proof goals that are chosen at each step. To make use of this, a working knowledge of tamarin’s interactive mode and a basic understanding of our translation (but not of the correctness proof) are necessary. A tight integration of SAPIC into tamarin would surely aid in this regard, but requires significant engineering effort. Such an integration could additionally provide information given by the process description. Relations between locks, lookups and inserts could be highlighted and protocol roles (often defined as abbreviations by protocol designers) distinguished.

For protocols which have complicated control flow or structure (e.g. group protocols [32]), a direct encoding may actually be better suited. We provide a mechanism for embedding labelled msr rules directly inside processes (described in the conference version [21]), which may be useful in some circumstances, and such a mixed model might sometimes give the user “the best of the two approaches”.

8. Conclusion

We present a process calculus which extends the applied pi calculus with constructs for accessing a global, shared memory together with an encoding of this calculus in labelled msr rules which enables automated verification using the tamarin prover as a backend. Our prototype verification tool, automating this translation, has been successfully used to analyze several case studies. As future work we plan to increase the degree of automation of the tool by automatically generating helping lemmas. To achieve this goal we can exploit the fact that we generate the msr rules, and hence control their form. We also plan to use the tool for more complex case studies, specifically contract signing protocols.

Acknowledgments

This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 645865-SPOOC), and from the German Federal Ministry of Education and Research (BMBF) within EC SPRIDE.

References

1. Abadi and Cortier, Deciding knowledge in security protocols under equational theories, Theoretical Computer Science 387(1–2) (2006), 2–32. doi:10.1016/j.tcs.2006.08.032.

2. Abadi and Fournet, Mobile values, new names, and secure communication, in: Proc. 28th ACM Symp. on Principles of Programming Languages (POPL’01), ACM Press, 2001, pp. 104–115.

3. Arapinis, Ritter and Ryan, StatVerif: Verification of stateful processes, in: Proc. 24th IEEE Computer Security Foundations Symposium (CSF’11), IEEE Press, 2011, pp. 33–47.

4. Armando, D.A. Basin, Boichut, Chevalier, Compagna, Cuéllar, Hankes Drielsma, P.-C. Héam, Kouchnarenko, Mantovani, Mödersheim, von Oheimb, Rusinowitch, Santiago, Turuani, Viganò and Vigneron, The AVISPA tool for the automated validation of Internet security protocols and applications, in: Proc. 17th International Conference on Computer Aided Verification (CAV’05), LNCS, Springer, 2005, pp. 281–285.

5. Armando, Carbone, Compagna, Cuellar and Tobarra Abad, Formal analysis of SAML 2.0 web browser single sign-on: Breaking the SAML-based single sign-on for Google apps, in: Proc. 6th ACM Workshop on Formal Methods in Security Engineering (FMSE’08), 2008, pp. 1–10. doi:10.1145/1456396.1456397.

6. Bistarelli, Cervesato, Lenzini and Martinelli, Relating multiset rewriting and process algebras for security protocol analysis, Journal of Computer Security 13(1) (2005), 3–47. doi:10.3233/JCS-2005-13102.

7. Blanchet, An efficient cryptographic protocol verifier based on Prolog rules, in: Proc. 14th Computer Security Foundations Workshop (CSFW’01), IEEE Press, 2001, pp. 82–96.

8. Blanchet, Smyth and Cheval, ProVerif 1.88: Automatic cryptographic protocol verifier, user manual and tutorial, 2013.

9. Bond and Anderson, API level attacks on embedded systems, IEEE Computer Magazine 34 (2001), 67–75. doi:10.1109/2.955101.

10. Bortolozzo, Centenaro, Focardi and Steel, Attacking and fixing PKCS#11 security tokens, in: Proc. 17th ACM Conference on Computer and Communications Security (CCS’10), ACM Press, 2010, pp. 260–269. doi:10.1145/1866307.1866337.

11. CCA Basic Services Reference and Guide, October 2006, available online.

12. Delaune, Kremer, M.D. Ryan and Steel, Formal analysis of protocols based on TPM state registers, in: Proc. 24th IEEE Computer Security Foundations Symposium (CSF’11), IEEE Press, 2011, pp. 66–82.

13. Delaune, Kremer and Steel, Formal analysis of PKCS#11 and proprietary extensions, Journal of Computer Security 18(6) (2010), http://www.lsv.ens-cachan.fr/Publis/PAPERS/PDF/DKS-jcs09.pdf. doi:10.3233/JCS-2009-0394.

14. Escobar, Meadows and Meseguer, Maude-NPA: Cryptographic protocol analysis modulo equational properties, in: Foundations of Security Analysis and Design V, LNCS, Vol. 5705, Springer, 2009, pp. 1–50. doi:10.1007/978-3-642-03829-7_1.

15. S.B. Fröschle and Sommer, Reasoning with past to prove PKCS#11 keys secure, in: Proc. 7th International Workshop on Formal Aspects in Security and Trust (FAST’10), LNCS, Vol. 6561, 2010, pp. 96–110. doi:10.1007/978-3-642-19751-2_7.

16. J.A. Garay, Jakobsson and P.D. MacKenzie, Abuse-free optimistic contract signing, in: Advances in Cryptology – Crypto’99, LNCS, Vol. 1666, Springer, 1999, pp. 449–466. doi:10.1007/3-540-48405-1_29.

17. J.D. Guttman, State and progress in strand spaces: Proving fair exchange, J. Autom. Reasoning 48(2) (2012), 159–195. doi:10.1007/s10817-010-9202-1.

18. Herzog, Applying protocol analysis to security device interfaces, IEEE Security & Privacy Magazine 4(4) (2006), 84–87. doi:10.1109/MSP.2006.85.

19. Rakotonirina, Vérification automatique de protocoles de sécurité avec mémoire globale et boucles, Internship report, September 2014, URL http://www.dptinfo.ens-cachan.fr/~irakoton/stagel3/rapportl3.pdf.

20. F.J. Thayer Fabrega, J.C. Herzog and J.D. Guttman, Strand spaces: Proving security protocols correct, Journal of Computer Security 7(2–3) (1999), 191–230.

21. Kremer and Künnemann, Automated analysis of security protocols with global state, in: Proc. 35th IEEE Symposium on Security and Privacy (S&P’14), IEEE Computer Society Press, 2014, pp. 163–178. doi:10.1109/SP.2014.18.

22. Künnemann, Automated backward analysis of PKCS#11 v2.20, in: Proc. 4th Conference on Principles of Security and Trust (POST’15), LNCS, Vol. 9036, Springer, 2015, pp. 219–238.

23. Künnemann and Steel, YubiSecure? Formal security analysis results for the Yubikey and YubiHSM, in: Proc. 8th Workshop on Security and Trust Management (STM’12), LNCS, Vol. 7783, 2012, pp. 257–272. doi:10.1007/978-3-642-38004-4_17.

24. Longley and Rigby, An automatic search for security flaws in key management schemes, Computers and Security 11(1) (1992), 75–89. doi:10.1016/0167-4048(92)90222-D.

25. Lowe, Breaking and fixing the Needham–Schroeder public-key protocol using FDR, in: Proc. 2nd International Workshop on Tools and Algorithms for Construction and Analysis of Systems (TACAS’96), LNCS, Vol. 1055, Springer, 1996, pp. 147–166. doi:10.1007/3-540-61042-1_43.

26. Mödersheim, Abstraction by set-membership: Verifying security protocols and web services with databases, in: Proc. 17th ACM Conference on Computer and Communications Security (CCS’10), ACM, 2010, pp. 351–360. doi:10.1145/1866307.1866348.

27. PKCS #11: Cryptographic token interface standard, RSA Security Inc., v2.20, June 2004.

28. J.D. Ramsdell, D.J. Dougherty, J.D. Guttman and P.D. Rowe, A hybrid analysis for security protocols with state, in: Proc. 11th International Conference on Integrated Formal Methods (IFM’14), LNCS, Vol. 8739, Springer, 2014, pp. 272–287. doi:10.1007/978-3-319-10181-1.

29. Schmidt, Formal analysis of key-exchange protocols and physical protocols, PhD thesis, ETH Zürich, 2012.

30. Schmidt, Meier, Cremers and Basin, Automated analysis of Diffie–Hellman protocols and advanced security properties, in: Proc. 25th IEEE Computer Security Foundations Symposium (CSF’12), IEEE Press, 2012, pp. 78–94.

31. Schmidt, Meier, Cremers and Basin, The tamarin prover for the symbolic analysis of security protocols, in: Proc. 25th International Conference on Computer Aided Verification (CAV’13), LNCS, Vol. 8044, Springer, 2013, pp. 696–701.

32. Schmidt, Sasse, Cremers and D.A. Basin, Automated verification of group key agreement protocols, in: Proc. 35th IEEE Symposium on Security and Privacy (S&P’14), IEEE Computer Society Press, 2014, pp. 179–194. doi:10.1109/SP.2014.19.

33. Shao, Qin, Feng and Wang, Formal analysis of enhanced authorization in the TPM 2.0, in: Proc. 10th ACM Symposium on Information, Computer and Communications Security (ASIA CCS’15), ACM, 2015, pp. 273–284.

34. The YubiKey Manual – Usage, configuration and introduction of basic concepts (Version 2.2), available at: http://www.yubico.com/documentation. Yubico AB, Kungsgatan 37, 111 56 Stockholm Sweden, June 2010.

35. Trusted Computing Group, TPM specification version 1.2. Parts 1–3, revision 103, 2007, available at: http://www.trustedcomputinggroup.org/resources/tpm_main_specification.

36. Yubico AB, Yubico customer list, 2014, URL https://www.yubico.com/about/reference-customers/, accessed: Thu 13 November 2014, 08:33:34 CET.
