Proved generation of implementations from computationally secure protocol specifications

Abstract

In order to obtain implementations of security protocols proved secure in the computational model, we previously proposed the following approach: we write a specification of the protocol in the input language of the computational protocol verifier CryptoVerif, prove it secure using CryptoVerif, then generate an OCaml implementation of the protocol from the CryptoVerif specification using a specific compiler that we have implemented. However, until now, this compiler was not proved correct, so we did not have real guarantees on the generated implementation. In this paper, we fill this gap. We prove that this compiler preserves the security properties proved by CryptoVerif: if an adversary has probability p of breaking a security property in the generated code, then there exists an adversary that breaks the property with the same probability p in the CryptoVerif specification. Therefore, if the protocol specification is proved secure in the computational model by CryptoVerif, then the generated implementation is also secure.

Keywords

Cryptographic protocol, computational model, implementation, compiler, CryptoVerif, OCaml, verification

1. Introduction

The verification of security protocols has been an important research area since the 1990s: the design of security protocols is notoriously error-prone, and errors can have serious consequences. Formal verification first focused on verifying formal specifications of protocols. However, verifying a specification does not guarantee that the protocol is correctly implemented from this specification. It is therefore important to make sure that the implementation is secure, and not only the specification. Moreover, two models have been considered for verifying protocols. In the symbolic model, the so-called Dolev–Yao model, messages are terms. This abstract model facilitates automatic proofs. In contrast, in the computational model, typically used by cryptographers, messages are bitstrings and attackers are polynomial-time probabilistic Turing machines. Proofs in the latter model are more difficult than in the former, but yield a much more precise analysis of the protocol. Therefore, we would like to obtain implementations of protocols proved secure in the computational model.

To reach this goal, we proposed the following approach in [10]. We start from a formal specification of the protocol. In order to prove the specified protocol secure in the computational model, we rely on the automatic protocol verifier CryptoVerif [7–9]. This verifier can prove secrecy and authentication properties. The generated proofs are proofs by sequences of games, like the manual proofs written by cryptographers. These games are formalized in a probabilistic process calculus. The specification of the protocol given as input to CryptoVerif then consists of a process representing the protocol to prove (the initial game of the proof), assumptions on the cryptographic primitives (such as “encryption is IND-CPA” and “decrypting a ciphertext with the correct key yields the initial cleartext”), and the security properties to prove. CryptoVerif then looks for a proof of the desired security properties, and when it finds one, it also provides a formula that bounds the probability of success of an attack against the desired properties as a function of the runtime of the adversary, the number of sessions of the protocol, and the probability of breaking each primitive.

In order to obtain a proved implementation from the specification, we have written a compiler that takes a CryptoVerif specification and returns an implementation in the functional language OCaml (http://caml.inria.fr). This compiler starts from a CryptoVerif specification annotated with implementation details. The annotations specify how to divide the protocol into different roles, for example, key generation, server, and client, and how to implement the various cryptographic primitives and types. They also specify which CryptoVerif variables should be written into files, because they are communicated from one role to another. For instance, the key generation typically writes long-term keys into files, so that they can be used by subsequent roles. The compiler then generates an OCaml module for each role in the input file. In order to get a full implementation of the protocol, these modules are combined with manually written code, responsible in particular for sending and receiving messages from the network, which we call the network code. For instance, in the case of a client–server protocol, both the client and server programs consist of a mix of our generated modules, which deal with the heart of the cryptographic protocol, and manually written network code, which deals with non-cryptographic details.

To make sure that the generated implementation is actually secure, we need to prove the correctness of our compiler. This proof was still missing in [10]. It is the topic of this paper. To make this proof, we need a formal semantics of OCaml. We adapt the operational small-step semantics of a core part of OCaml by Owens et al. [17,18]. We add to this language support for simplified modules, multiple threads where only one thread can run at any given time, and communication between threads by a shared part of the store.

An adversary against the generated implementation is an OCaml program using the modules generated by our compiler. On the CryptoVerif side, an adversary is a process running in parallel with the verified protocol. In our proof, for each OCaml adversary, we construct a corresponding CryptoVerif adversary that simulates the behavior of the OCaml adversary. When the OCaml adversary calls one of the functions generated by our compiler, which comes from an oracle in the CryptoVerif process, the CryptoVerif adversary calls this oracle. Then we establish a precise correspondence between the traces of the CryptoVerif process with that CryptoVerif adversary and the traces of the OCaml program. This correspondence allows us to show that the probability of success of an attack is the same on the CryptoVerif side and on the OCaml side. Therefore, if CryptoVerif proves that the protocol is secure, then the generated OCaml implementation is also secure, and the bound on the probability of success of an attack computed by CryptoVerif is also valid for the implementation.

We have made several assumptions to obtain this proof; the most important ones are:

(A1) The random number generator used by the OCaml cryptographic library is perfect.

(A2) The implementation of each cryptographic primitive is a pure function and satisfies the assumptions made on it in the specification.

(A3) The roles are executed in the order specified in CryptoVerif (e.g., in a key-exchange protocol, the key generation is called before the servers and clients).

(A4) The adversary and the network code do not access files created by our implementation (e.g., private key files).

(A5) The network code is a well-typed OCaml program, which does not use unsafe OCaml functions to bypass the type system.

(A6) The network code does not mutate data passed to or received from generated code. This property can be guaranteed by representing such data by immutable OCaml types. However, such data includes bitstrings and the most natural type for representing bitstrings is the OCaml type string, which is mutable. (From version 4.02, OCaml has a command-line option that makes string immutable.) Immutable strings can be implemented in OCaml using an abstract type instead of string. In our semantics, strings are immutable values.

(A7) Our semantics of threads is obeyed, which implies that only one thread can run at any given time and that one cannot fork in the middle of a role.

Because the network code and our generated modules run inside the same programs, we use assumptions (A5) and (A6) to make sure that the network code does not interfere with the generated code. In particular, Assumption (A5) prevents the network code from accessing the variables contained in the environment of functions returned by our generated code. These variables may contain secret keys, which the network code could send to the adversary if it had access to them. Moreover, our generated code may return both a public key and a function that includes this public key in its environment. If the network code could modify the returned public key, it would modify the key used by the function as well, so the protocol would use an unexpected public key. Assumption (A6) avoids that. Assumptions (A5) and (A6) are the only requirements on the network code needed to prove security so, except for these two assumptions, we consider the network code as part of the adversary. In Assumption (A7), the requirement that only one thread can run at any given time can be weakened as we discuss informally at the end of Section 5.2.5: the essential requirement is that two processes that read or write the same file are not run concurrently, which can be enforced using locks. Assumption (A7) also limits forking: forking is allowed when the local store is empty. In case one needs to fork in the middle of a role, one can split the role into two, which has the effect of transmitting the store via files between the two roles. It may also be possible to extend our result with an explicit fork instruction in the OCaml language.
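The sharing problem that Assumption (A6) rules out can be reproduced in a few lines. The following is only an illustrative sketch, not the generated code: `make_verifier`, `make_verifier_imm` and the module `Imm` are hypothetical names, and `Bytes.t` stands in for a mutable string type. The first function returns a key together with a closure that holds the same buffer in its environment; the second uses an abstract type so that the caller cannot mutate the key.

```ocaml
(* Hypothetical sketch: a closure that captures a mutable buffer.
   The returned key and the closure environment share the same bytes. *)
let make_verifier (pk : Bytes.t) : Bytes.t * (Bytes.t -> bool) =
  (pk, fun candidate -> Bytes.equal candidate pk)

(* An abstract type hides the representation: code outside the module
   has no operation that mutates a value of type Imm.t. *)
module Imm : sig
  type t
  val of_bytes : Bytes.t -> t
  val to_bytes : t -> Bytes.t
  val equal : t -> t -> bool
end = struct
  type t = Bytes.t
  let of_bytes b = Bytes.copy b   (* copy on the way in *)
  let to_bytes t = Bytes.copy t   (* copy on the way out *)
  let equal = Bytes.equal
end

(* Same shape as make_verifier, but the key can no longer be modified
   by the caller after it has been returned. *)
let make_verifier_imm (pk : Imm.t) : Imm.t * (Imm.t -> bool) =
  (pk, fun candidate -> Imm.equal candidate pk)
```

With `make_verifier`, overwriting one byte of the returned key changes the key that the closure compares against; with `make_verifier_imm`, the only way to obtain bytes is `Imm.to_bytes`, which hands out a copy.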

Assumptions (A1), (A5) and (A7) are built into our semantics of OCaml, defined in Section 5. Assumption (A2) is formalized below by Assumption 8.4, with additional technical details formalized in Assumptions 8.1 and 8.2. Assumptions (A3), (A4) and (A6) are formalized by Assumptions 6.1, 7.2 and 8.3, respectively.

In this work, we do not consider side-channel attacks, such as timing and power consumption attacks, nor physical attacks. Like other mechanized tools for cryptographic proofs, CryptoVerif does not deal with these attacks.

Related work.

Several approaches have been considered in order to obtain proved implementations of security protocols. In the symbolic model, several approaches generate protocol implementations from specifications, e.g. [16,19]. Other approaches analyze implementations by extracting a specification verified by a symbolic protocol verifier, e.g. [2,6], or analyze them by other tools such as the model-checker ASPIER [12], the general-purpose C verifier VCC [14], symbolic execution [13] or typing [5,20].

In contrast, the following approaches provide computational security guarantees, by analyzing implementations. The tool FS2CV [1] translates a subset of F# to the input language of CryptoVerif, which can then prove the protocol secure. The tool F7 [5], which uses a dependent type system to prove security properties on protocols implemented in F#, has been adapted to the computational model in [15]; it uses type annotations to help the proof. The symbolic execution approach of [2] provides computational security guarantees by applying a computational soundness result, which however restricts the class of protocols that can be considered. The tool of [3] generates a CryptoVerif model from a C implementation; however, it can analyze only a single execution path.

Recently, Almeida et al. [4] introduced a new approach for generating implementations with a computational proof. They extend the cryptographic prover EasyCrypt to support C-like programs, then they generate proved assembly code using an extended version of the CompCert certified C compiler. They mainly target cryptographic primitives (for instance, OAEP), and using EasyCrypt requires the user to give the games of the cryptographic proof, while in our approach CryptoVerif generates them.

To the best of our knowledge, our approach and that of [4] are the only ones for generating implementations with a computational proof. References [3,4] and our work are the only ones to provide an explicit bound on the probability of success of an attack against the verified protocol implementation.

2. Intuitive overview

In order to prove the correctness of a compiler, we first need a formal semantics of the source and target languages, and a formal definition of the compiler. Handling all this formalism is probably the main challenge of this paper; it explains its length.

After introducing some notations (Section 3), our first task is to formally define the common input language of CryptoVerif and of our compiler (Section 4). We define the semantics of this language as a probabilistic transition system on semantic configurations. CryptoVerif uses events to define security properties. For instance, a security property may be “if event $Baccepts (m^{'})$ has been executed, then event $Asends (m^{'})$ has also been executed”. For each security property, we define a distinguisher D that is true when the executed sequence of events breaks the security property. We denote by $Pr [\mathcal{C}_{i} (Q_{0} ∣ Q_{adv}) :^{(CV)} D]$ the probability that the security property associated to D is broken starting from the initial configuration $\mathcal{C}_{i} (Q_{0} ∣ Q_{adv})$, which runs the protocol $Q_{0}$ in parallel with the adversary $Q_{adv}$. In other words, $Pr [\mathcal{C}_{i} (Q_{0} ∣ Q_{adv}) :^{(CV)} D]$ is the probability that the adversary $Q_{adv}$ breaks the desired security property of the protocol $Q_{0}$. When it proves the security property, CryptoVerif provides a formula that bounds this probability for any adversary $Q_{adv}$, as a function of the runtime of the adversary, the execution time of the cryptographic primitives and of various CryptoVerif constructs, the number of calls to each oracle, the probability of collisions between random numbers, and the probability of breaking each primitive.
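To make the notion of distinguisher concrete, here is a minimal OCaml sketch for the correspondence property quoted above. The type `event` and the function `d` are hypothetical names chosen for illustration, nonces are represented as strings, and the check ignores the order of events in the sequence; the formal definition is the one given in Section 4.

```ocaml
(* Hypothetical event type for the correspondence property in the text. *)
type event =
  | Asends of string
  | Baccepts of string

(* [d events] is true when the executed sequence of events breaks the
   property: some Baccepts m occurs without any Asends m in the sequence. *)
let d (events : event list) : bool =
  List.exists
    (function
      | Baccepts m -> not (List.mem (Asends m) events)
      | Asends _ -> false)
    events
```

A trace in which B accepts only nonces that A sent makes `d` false; a trace containing an acceptance with no matching sending makes it true.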

Section 5 defines the OCaml language. We rely on the operational small-step semantics of a core part of OCaml by Owens et al. [17,18], which we adapt to our setting. We add a primitive for random choices, which makes the semantics probabilistic. We also add support for simplified modules, multiple threads, and communication between threads by a shared part of the store. We adopt a simplified model of parallelism: only one thread runs at a time, and the adversary is in charge of scheduling. This model of parallelism is close to what happens in CryptoVerif; we explain informally why it is sufficient for our purpose in Section 5. Like the semantics of the CryptoVerif input language, the semantics of OCaml is defined as a probabilistic transition system on semantic configurations.

In order to prove our compiler, we instrument OCaml code in three ways (Section 6). We add events to the language, so that we can specify security properties in OCaml as we do in CryptoVerif. We introduce tagged functions and closures, which have the same semantics as ordinary functions and closures, but contain additional tags used in our code generation to indicate from which role or oracle the function comes. Each CryptoVerif role is translated by our compiler into an OCaml module; we add to the OCaml semantics a multiset of callable modules, which indicates which modules can be called to guarantee that only allowed roles are executed, as required by Assumption (A3). We show that this instrumentation does not alter the semantics of OCaml: an instrumented program behaves exactly in the same way as that program with the instrumentation deleted, provided only allowed roles are executed. More precisely, we show a weak bisimulation between the non-instrumented and the instrumented semantics.

Section 7 defines the translation from CryptoVerif to OCaml. In this translation, each role generates a module, and the oracles are represented by closures. Basically, the translation implements in OCaml the semantics of CryptoVerif given in Section 4. The translation is the same as the one given in [10], except that the generated OCaml code is instrumented. The generated modules are combined with manually written network code (which is in particular responsible for inputting and outputting messages on the network) to produce the complete programs that implement the protocol. These programs are run in interaction with an adversary, which we also represent by an OCaml program. This is possible because OCaml with random choices is probabilistic Turing complete. The code generated from the CryptoVerif process $Q_{0}$, the network code, and the adversary are all grouped into the OCaml program ${program}_{0}$, and we denote by $\mathbb{C}_{0} (Q_{0}, {program}_{0})$ the initial (instrumented) OCaml semantic configuration that runs ${program}_{0}$. The probability $Pr [\mathbb{C}_{0} (Q_{0}, {program}_{0}) :^{(ML)} D]$ denotes the probability that the OCaml adversary defined in ${program}_{0}$ breaks the security property associated to the distinguisher D of the protocol $Q_{0}$.

Section 8 proves the correctness of this compiler. Informally, when CryptoVerif shows that $Q_{0}$ satisfies a certain security property, it shows that for any CryptoVerif adversary $Q_{adv}$, the probability $Pr [\mathcal{C}_{i} (Q_{0} ∣ Q_{adv}) :^{(CV)} D]$ is bounded by a certain bound. Our goal is to show that the same probability bound also applies to the generated implementation, that is, the probability $Pr [\mathbb{C}_{0} (Q_{0}, {program}_{0}) :^{(ML)} D]$ that ${program}_{0}$ breaks the security property is bounded by the same bound for any ${program}_{0}$.

Fig. 1. Overview of our proof.

The presence of an arbitrary adversary complicates the proof. As illustrated in Fig. 1 and detailed in Section 8.3, to solve this problem, we build from the OCaml adversary defined in ${program}_{0}$ a CryptoVerif adversary $Q_{adv} (Q_{0}, {program}_{0})$ that simulates ${program}_{0}$ . Basically, we run the OCaml program ${program}_{0}$ inside a CryptoVerif function ${simulate}_{ML}$ . Since these functions can represent deterministic Turing machines, when ${program}_{0}$ needs to generate a random number, the function ${simulate}_{ML}$ returns and this generation is performed by CryptoVerif. Similarly, when ${program}_{0}$ would call the generated implementation of an oracle, the function ${simulate}_{ML}$ returns and $Q_{adv} (Q_{0}, {program}_{0})$ calls the corresponding CryptoVerif oracle in $Q_{0}$ .

The initial CryptoVerif configuration is then $\mathcal{C}_{0} (Q_{0}, {program}_{0}) = \mathcal{C}_{i} (Q_{0} ∣ Q_{adv} (Q_{0}, {program}_{0}))$. We prove that, for all protocols $Q_{0}$, OCaml adversaries defined in ${program}_{0}$, and distinguishers D, we have

$$Pr [\mathcal{C}_{0} (Q_{0}, {program}_{0}) :^{(CV)} D] = Pr [\mathbb{C}_{0} (Q_{0}, {program}_{0}) :^{(ML)} D] . \tag{1}$$

From this property, it is easy to see that, if CryptoVerif bounds the probability $Pr [\mathcal{C}_{i} (Q_{0} ∣ Q_{adv}) :^{(CV)} D]$ for any adversary $Q_{adv}$ for $Q_{0}$, then the same bound also holds for the probability $Pr [\mathbb{C}_{0} (Q_{0}, {program}_{0}) :^{(ML)} D]$ corresponding to the generated implementation. Indeed, $Pr [\mathbb{C}_{0} (Q_{0}, {program}_{0}) :^{(ML)} D] = Pr [\mathcal{C}_{0} (Q_{0}, {program}_{0}) :^{(CV)} D] = Pr [\mathcal{C}_{i} (Q_{0} ∣ Q_{adv} (Q_{0}, {program}_{0})) :^{(CV)} D]$ and $Q_{adv} (Q_{0}, {program}_{0})$ is an adversary for $Q_{0}$.

To prove (1), we basically need to show that $Q_{0} ∣ Q_{adv} (Q_{0}, {program}_{0})$ and ${program}_{0}$ using the code generated from $Q_{0}$ behave similarly. This proof proceeds in several steps:

First, we state our assumptions on the implementation of the cryptographic primitives, and show that the primitives behave correctly independently of the rest of the program (Section 8.1).

Second, we prove that the OCaml translation of a CryptoVerif oracle behaves like the oracle (Section 8.2).

Finally, in Section 8.4, we prove that the CryptoVerif adversary interacting with $Q_{0}$ behaves like the OCaml adversary interacting with the generated implementation. This proof is done by establishing a precise relation between the CryptoVerif and OCaml semantic configurations.

Therefore, we obtain the desired proof of (1) (Theorem 8.38). Because of the length of this proof, details are postponed to the Appendices (see the Supplemental material). An index of notations can be found in Appendix H (see the Supplemental material).

3. Notations

Let us introduce some basic notations. When f is a function, we denote by $Dom (f)$ the domain of f, that is, the set of elements x such that $f (x)$ is defined. We denote by $f [x \mapsto y]$ the function $f^{'}$ defined by $f^{'} (x) = y$ and $f^{'} (x^{'}) = f (x^{'})$ for $x^{'} \neq x$ . When $f_{1}$ and $f_{2}$ are functions with disjoint domains, we denote by $f_{1} \cup f_{2}$ the function $f^{'}$ defined by $f^{'} (x) = f_{1} (x)$ if $x \in Dom (f_{1})$ and $f^{'} (x) = f_{2} (x)$ if $x \in Dom (f_{2})$ . When $f_{1}$ and $f_{2}$ are functions, we write $f_{1} \subseteq f_{2}$ (or $f_{2} \supseteq f_{1}$ ) when $Dom (f_{1}) \subseteq Dom (f_{2})$ and, for all $x \in Dom (f_{1})$ , we have $f_{2} (x) = f_{1} (x)$ . We denote by ∅ any function whose domain is the empty set ∅.
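As an illustration of these notations, finite functions can be modelled by association lists. The sketch below is one possible encoding, assuming at most one binding per key; the names `fn`, `dom`, `update`, `union` and `included` are hypothetical and do not come from the paper.

```ocaml
(* Finite functions as association lists with at most one binding per key. *)
type ('a, 'b) fn = ('a * 'b) list

(* Dom(f): the elements on which f is defined. *)
let dom (f : ('a, 'b) fn) : 'a list = List.map fst f

(* f[x |-> y]: maps x to y and agrees with f everywhere else. *)
let update (f : ('a, 'b) fn) (x : 'a) (y : 'b) : ('a, 'b) fn =
  (x, y) :: List.remove_assoc x f

(* Union of f1 and f2, defined only when the domains are disjoint. *)
let union (f1 : ('a, 'b) fn) (f2 : ('a, 'b) fn) : ('a, 'b) fn =
  assert (List.for_all (fun x -> not (List.mem_assoc x f2)) (dom f1));
  f1 @ f2

(* f1 included in f2: f2 is defined on Dom(f1) and agrees with f1 there. *)
let included (f1 : ('a, 'b) fn) (f2 : ('a, 'b) fn) : bool =
  List.for_all (fun (x, y) -> List.assoc_opt x f2 = Some y) f1
```

In this encoding the empty function ∅ is simply the empty list.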

We use the following notations for lists. Let $[]$ be the empty list, and $x :: l$ be the list obtained by adding the element x to the list l. Let $[x_{1}; \dots; x_{k}]$ be the list $x_{1} :: \dots :: x_{k} :: []$. Let $[x \in l ∣ Prop (x)]$ be the list containing all elements x of l that satisfy the property $Prop (x)$, in the same order as in l. This construct is defined by induction on lists:

$$\begin{array}{rcl} [x \in [] ∣ Prop (x)] & \overset{def}{=} & [], \\ [x \in y :: l ∣ Prop (x)] & \overset{def}{=} & \begin{cases} [x \in l ∣ Prop (x)] & \text{if } \neg Prop (y), \\ y :: [x \in l ∣ Prop (x)] & \text{otherwise.} \end{cases} \end{array}$$

The concatenation of lists $l_{1} @ l_{2}$ is the list containing all elements of $l_{1}$ followed by all elements of $l_{2}$. The membership test $x \in l$ is true when l contains the element x, and false otherwise. Let $| l |$ be the length of the list l, and $nth (l, n)$ be the nth element of list l.
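The inductive definition of $[x \in l ∣ Prop (x)]$ transcribes directly into OCaml. The function name `filter_list` is a hypothetical choice for this sketch; it computes the same result as the standard library function `List.filter`.

```ocaml
(* The list of all elements of l that satisfy prop, in the same order,
   defined by induction on the list as in the text. *)
let rec filter_list (prop : 'a -> bool) (l : 'a list) : 'a list =
  match l with
  | [] -> []
  | y :: rest ->
    if not (prop y) then filter_list prop rest
    else y :: filter_list prop rest
```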

We define the function $almostunif (A, b)$ as the probability that the element $b \in A$ is chosen among elements of the set A, according to an almost uniform distribution: we require that, for every set A, $\sum_{b \in A} almostunif (A, b) = 1$, $almostunif (A, b) > 0$ for all $b \in A$, and $\sum_{b \in A} | almostunif (A, b) - \frac{1}{| A |} | ⩽ ε$ for some $ε > 0$. Indeed, probabilistic Turing machines can choose random elements uniformly only in sets whose cardinality is a power of 2. For other sets, they can choose random elements with a probability distribution as close as we wish to uniform, that is, we can make ε as small as we wish in the formula above.
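The following sketch illustrates how ε shrinks for one particular sampling method, which is an assumption of this example and is not taken from the paper: draw k uniform random bits, read them as an integer v in $[0, 2^{k})$, and select element number v mod n of a set with n elements. The functions `prob` and `deviation` are hypothetical names; `deviation` computes the sum that the definition bounds by ε.

```ocaml
(* Assumed sampling method (not from the paper): draw k uniform bits,
   read them as v in [0, 2^k), and return element (v mod n) of an
   n-element set. Requires 2^k >= n so every element has probability > 0. *)

(* Probability that element number b (0 <= b < n) is chosen. *)
let prob ~(n : int) ~(k : int) (b : int) : float =
  let total = 1 lsl k in
  (* Number of v in [0, total) such that v mod n = b. *)
  let count = (total / n) + (if b < total mod n then 1 else 0) in
  float_of_int count /. float_of_int total

(* Sum over all b of |prob b - 1/n|, the quantity bounded by epsilon. *)
let deviation ~(n : int) ~(k : int) : float =
  let rec go b acc =
    if b = n then acc
    else go (b + 1) (acc +. abs_float (prob ~n ~k b -. (1.0 /. float_of_int n)))
  in
  go 0 0.0
```

When n is a power of 2 that divides $2^{k}$, the deviation is exactly zero; for n = 3 it is nonzero for every k but decreases as k grows.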

Fonts.

We use a sans-serif font for CryptoVerif keywords (e.g., $\mathsf{foreach}$) and role names (e.g., $\mathsf{keygen}$), a roman font for CryptoVerif function, constant, event, and oracle symbols, and an italic font for CryptoVerif types, variables and file names. We use a bold font for OCaml keywords (e.g., $\mathbf{match}$) and constructors (e.g., $\mathbf{Callable}$), and an italic font for OCaml types and other identifiers. We use uppercase italic letters (e.g., E, P) and a calligraphic font (e.g., $\mathcal{C}$) for CryptoVerif semantic elements, while we use lowercase italic words (e.g., $\mathit{env}$) and a blackboard font (e.g., $\mathbb{C}$) for OCaml semantic elements. We use an italic font for most other mathematical symbols, and a sans-serif font for constant elements.

4. The CryptoVerif input language

This section presents the syntax and semantics of the CryptoVerif input language, as well as the annotations that specify implementation details. CryptoVerif supports two input languages: the channel and oracle front-ends. The channel front-end [7] uses channels to pass data between the adversary and the protocol, and the oracle front-end [9] defines oracles that can be called by the adversary. In this paper, we focus on the oracle front-end, which is closer to the syntax of games used by cryptographers; oracles are also easier to translate into OCaml functions. (Our compiler also supports the channel front-end.) We adapt the semantics given in [7] for the channel front-end to the oracle front-end.

4.1. Syntax and informal semantics

Fig. 2. Syntax of the CryptoVerif language.

Let us first introduce the syntax of the CryptoVerif language in Fig. 2. The language is typed, and types T are subsets of ${bitstring}_{⊥} \overset{def}{=} bitstring \cup \{⊥\}$ where $bitstring$ is the set of all bitstrings and ⊥ is a symbol that is not a bitstring, used, for example, to represent the failure of a decryption. The boolean type $bool \overset{def}{=} \{true, false\}$, where $true$ is the bitstring 1 and $false$ is the bitstring 0, and the types $bitstring$ and ${bitstring}_{⊥}$ are predefined.

Variables $x [i_{1}, \dots, i_{m}]$ are arrays of bitstrings of a given type T. As formalized by Property 4.3, each variable $x [i_{1}, \dots, i_{m}]$ has a single definition and the indices $i_{1}, \dots, i_{m}$ are the indices of the replications $foreach i_{m} ⩽ N_{m} do \dots foreach i_{1} ⩽ N_{1} do Q$ present above the definition of x: each replication $foreach i ⩽ N do Q$ creates N copies of the process Q, in which i is set to $1, \dots, N$ , respectively. Then the indices $i_{1}, \dots, i_{m}$ have different values in different executions of the definition of x, so that each cell of the array x is assigned at most once. Therefore, arrays allow us to remember all values of the variables during the execution of the process. We call the indices $i_{1}, \dots, i_{m}$ replication indices, and we abbreviate $i_{1}, \dots, i_{m}$ by $\tilde{i}$ . The indices $i_{1}, \dots, i_{m}$ are ordered from the inner-most to the outer-most replication. Since the indices of x are entirely determined by the replications above the definition of x, we often omit them to lighten notations. Each function f comes with its type $T_{1} \times \dots \times T_{m} \to T$ ; all CryptoVerif functions are deterministic and efficiently computable. Some functions are predefined, and some are infix, like the equality test = and boolean operations. The cryptographic primitives used in the protocol are represented by CryptoVerif functions. Terms M represent computations over bitstrings: they can be variable accesses $x [i_{1}, \dots, i_{m}]$ or function applications $f (M_{1}, \dots, M_{m})$ .
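The single-assignment discipline on array cells can be pictured with a small OCaml model. This is a hypothetical illustration for a variable under a single replication $foreach i ⩽ N do \dots$, with names `cv_array`, `create`, `define` and `read` invented for the sketch; it is not how CryptoVerif or the compiler represents variables.

```ocaml
(* Hypothetical model of a CryptoVerif variable x[i] under one replication
   foreach i <= n: one cell per index, each assigned at most once. *)
type 'a cv_array = 'a option array

let create (n : int) : 'a cv_array = Array.make n None

(* Replication indices range over 1..n, so cell i lives at offset i - 1. *)
let define (x : 'a cv_array) (i : int) (v : 'a) : unit =
  match x.(i - 1) with
  | None -> x.(i - 1) <- Some v
  | Some _ -> invalid_arg "cell already defined"

(* Reading returns None for a cell whose definition has not been executed. *)
let read (x : 'a cv_array) (i : int) : 'a option = x.(i - 1)
```

Because a cell is never overwritten, the array keeps every value taken by the variable during the execution.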

The oracle definitions Q represent the oracles that will become available to the adversary at this point. The nil construct 0 provides no oracle. The parallel composition $Q ∣ Q^{'}$ provides oracles in Q and $Q^{'}$. The replication $foreach i ⩽ N do Q$ provides N copies of Q, indexed by $i \in \{1, \dots, N\}$. The bound N is unspecified and is used by CryptoVerif to express the maximum probability of breaking the protocol, which typically depends on the number of calls to the various oracles. The oracle definition $O [\tilde{i}] (x_{1} [\tilde{i}] : T_{1}, \dots, x_{k} [\tilde{i}] : T_{k}) := P$ makes available the oracle $O [\tilde{i}]$; when $O [\tilde{i}]$ is called by the adversary with arguments $a_{1}, \dots, a_{k}$, it executes the oracle body P with $x_{j} [\tilde{i}]$ set to $a_{j}$.

The oracle bodies P represent the behavior of the oracle. A return statement $return (M_{1}, \dots, M_{k}); Q$ returns the result of $M_{1}, \dots, M_{k}$ to the caller, and makes available oracles in Q. An end statement $end$ returns to the caller with an error. A random number assignment $x [\tilde{i}] \overset{R}{\leftarrow} T; P$ stores a uniformly chosen random value of type T in variable $x [\tilde{i}]$ , and continues by executing P. The type T must consist of all bitstrings of a given size; in this case, we say that T is a fixed-length type. An assignment $x [\tilde{i}] \leftarrow M; P$ puts the result of M in the variable $x [\tilde{i}]$ , and continues by executing P. A conditional statement $if M then P else P^{'}$ executes P if M evaluates to $true$ and $P^{'}$ otherwise.

An insert statement $insert Tbl (M_{1}, \dots, M_{k}); P$ inserts the result of $M_{1}, \dots, M_{k}$ into the table $Tbl$ . Tables are lists of tuples, used for example to store tables of keys. Each table $Tbl$ has a type $T_{1} \times \dots \times T_{k}$ , which means that $Tbl$ contains k-tuples $a_{1}, \dots, a_{k}$ such that $a_{j}$ is of type $T_{j}$ for all $j ⩽ k$ . A get statement $get Tbl (x_{1} [\tilde{i}], \dots, x_{k} [\tilde{i}]) suchthat M in P else P^{'}$ searches for an element $a_{1}, \dots, a_{k}$ in the table $Tbl$ such that the term M evaluates to $true$ when $x_{1} [\tilde{i}] = a_{1}, \dots, x_{k} [\tilde{i}] = a_{k}$ . If there is no such element, we continue by executing $P^{'}$ , and otherwise we choose almost-uniformly one of the elements that correspond, store it in the variables $x_{1} [\tilde{i}], \dots, x_{k} [\tilde{i}]$ , then execute P. An event statement $event ev (M_{1}, \dots, M_{k}); P$ is used to log events. Events serve for specifying security properties of protocols, but do not change the execution of the process.
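The behavior of insert and get can be sketched in OCaml as follows. This is a simplified illustration under stated assumptions: the table holds pairs of strings, the names `table`, `insert` and `get` are hypothetical, and `Random.int` is only a stand-in for the almost-uniform choice among matching elements.

```ocaml
(* A table is a list of tuples; here, pairs of strings for simplicity. *)
type table = (string * string) list ref

(* insert Tbl(a1, a2): add the tuple to the table. *)
let insert (tbl : table) (row : string * string) : unit =
  tbl := row :: !tbl

(* get Tbl(x1, x2) suchthat cond in P else P':
   Some row (continue with P) for a matching row picked at random,
   None (continue with P') when no row satisfies cond. *)
let get (tbl : table) (cond : string * string -> bool)
  : (string * string) option =
  match List.filter cond !tbl with
  | [] -> None
  | matches -> Some (List.nth matches (Random.int (List.length matches)))
```

A caller pattern-matches on the result to select the continuation, which mirrors the two branches of the get statement.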

An oracle call $let (x_{1} [\tilde{i}] : T_{1}, \dots, x_{k^{'}} [\tilde{i}] : T_{k^{'}}) = O [M_{1}, \dots, M_{l}] (M_{1}^{'}, \dots, M_{k}^{'}) in P else P^{'}$ calls oracle $O [M_{1}, \dots, M_{l}]$ with arguments $M_{1}^{'}, \dots, M_{k}^{'}$ , stores its returned values in the variables $x_{1} [\tilde{i}], \dots, x_{k^{'}} [\tilde{i}]$ , and continues by executing P if the oracle terminates with a $return$ statement, or continues by executing $P^{'}$ if the oracle terminates with $end$ .

A loop $let x [\tilde{i}] : T = loop O [M_{1}, \dots, M_{l}] (M^{'}) in P else P^{'}$ calls oracle O in a loop. Oracle O takes a unique argument (the internal state of the loop) and returns a pair containing the modified internal state of the loop and a boolean b indicating whether the loop should continue or not. For clarity, we use $continue$ as a synonym for $true$ and $stop$ for $false$ in this context. $O [M_{1}, \dots, M_{l}] (M^{'})$ is first called. If it returns $(a_{1}, continue)$ , $O [M_{1} + 1, M_{2}, \dots, M_{l}] (a_{1})$ is called. If it returns $(a_{2}, continue)$ , $O [M_{1} + 2, M_{2}, \dots, M_{l}] (a_{2})$ is called, and so on, until $O [M_{1} + k, M_{2}, \dots, M_{l}] (a_{k})$ returns $(a_{k + 1}, stop)$ . Then we run P with $x [\tilde{i}]$ set to $a_{k + 1}$ . If O terminates with $end$ , we run $P^{'}$ . Oracle call and loop statements cannot appear in the CryptoVerif process representing the protocol, but are used for representing the adversary. Some protocols use loops or recursion, for instance for certificate checking; such protocols could in principle be encoded in our language by using replicated processes and transmitting internal state from one iteration to the next using a table or an encrypted message. However, this idea leads to contrived encodings and a native loop construct would be more convenient. Including loops in protocols would not cause major problems for generating implementations, but would considerably complicate the prover CryptoVerif itself, since it would have to discover loop invariants. That is why we leave the inclusion of loops in protocols for future work.
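The control flow of the loop construct can be sketched as a recursive OCaml function. The type `outcome` and the function `loop` are hypothetical names; only the first index $M_{1}$ is modelled, the remaining indices being fixed, and the result `None` stands for the case where the oracle terminates with $end$.

```ocaml
(* Outcome of one oracle call: a new state with a continue/stop flag,
   or termination with end. *)
type 'a outcome =
  | Return of 'a * bool   (* true plays the role of continue, false of stop *)
  | End

(* Calls the oracle at indices m1, m1 + 1, ... and threads the state
   from one call to the next, until the oracle answers stop or end. *)
let rec loop (oracle : int -> 'a -> 'a outcome) (m1 : int) (state : 'a)
  : 'a option =
  match oracle m1 state with
  | End -> None                          (* continue with P' *)
  | Return (a, false) -> Some a          (* continue with P, x set to a *)
  | Return (a, true) -> loop oracle (m1 + 1) a
```

For instance, an oracle that adds its index to the state and stops at index 3, started at index 1 with state 0, yields the final state 6.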

Example 4.1.

Let us consider a simple protocol in which the first participant Alice generates a nonce m and sends it to the second participant Bob together with a signature of the nonce under Alice’s signature key $sk$. Bob then verifies that the signature is correct using Alice’s public key $pk$. This protocol can be described by the following CryptoVerif process:

$$\begin{array}{l}
Okeygen () := rk \overset{R}{\leftarrow} keyseed;\ pk \leftarrow pkgen (rk);\ sk \leftarrow skgen (rk);\ \mathsf{return}\ (pk); \\
\quad (\mathsf{foreach}\ i_{1} ⩽ N_{1}\ \mathsf{do}\ P_{A} ∣ \mathsf{foreach}\ i_{2} ⩽ N_{2}\ \mathsf{do}\ P_{B}) \\
P_{A} \overset{def}{=} OA () := m \overset{R}{\leftarrow} nonce;\ s \overset{R}{\leftarrow} seed;\ \mathsf{event}\ Asends (m);\ \mathsf{return}\ (m, sign (m, sk, s)) \\
P_{B} \overset{def}{=} OB (m^{'} : nonce, s^{'} : signature) := \\
\quad \mathsf{if}\ check (m^{'}, pk, s^{'})\ \mathsf{then}\ (\mathsf{event}\ Baccepts (m^{'});\ \mathsf{return}\ ())\ \mathsf{else}\ \mathsf{end}
\end{array}$$

The only callable oracle at the beginning is the oracle $Okeygen$, which generates the signature key pair $(pk, sk)$ by first generating a random seed $rk$ and applying the key generation algorithms $pkgen$ and $skgen$ to it. We return to the attacker the public key, so that the attacker can check whether a signature signed with the signature key $sk$ is correct. When the oracle $Okeygen$ returns, one can call the oracle $OA$ $N_{1}$ times, and the oracle $OB$ $N_{2}$ times.

The oracle $OA$ generates a random nonce m and a random seed s. Then, it executes the event $Asends (m)$ . This event just records that A sends the nonce m, without changing the execution of the process; we use it below to specify a security property. Finally, $OA$ returns the nonce m and the signature of the nonce m under the signature key $sk$ with the random seed s.

The oracle $OB$ takes as arguments a nonce $m^{'}$ and a signature $s^{'}$ , which should be the elements returned by a call to oracle $OA$ , and checks using the function $check$ whether the signature $s^{'}$ is indeed a correct signature of the message $m^{'}$ under the signature key $sk$ by using the public key $pk$ . If the signature is correct, the oracle executes the event $Baccepts (m^{'})$ and returns normally. Otherwise, the oracle terminates with $end$ .

The goal of this protocol is to guarantee that, with high probability, if B accepts a nonce $m^{'}$ , then A sent this nonce $m^{'}$ , that is, if event $Baccepts (m^{'})$ has been executed, then event $Asends (m^{'})$ has also been executed. This property is proved by CryptoVerif when signatures are unforgeable under chosen-message attacks (UF-CMA), as detailed in Example 4.9.

Example 4.2.

The previous toy example is not very realistic, in particular because B accepts messages only from A. In a more realistic setting, B could be a server that would process messages coming from several different clients. B would then use a table of keys to relate the identity of each client to its public key. We would then use the following process: $\begin{array}{rcl} Okeygen () : = rk \overset{R}{\leftarrow} keyseed; pk \leftarrow pkgen (rk); sk \leftarrow skgen (rk); insert KeyTbl (A, pk); return (pk); \\ (foreach i_{1} ⩽ N_{1} do P_{A} ∣ foreach i_{2} ⩽ N_{2} do P_{B} ∣ foreach i_{3} ⩽ N_{3} do P_{R}) \\ P_{A} \overset{def}{=} OA () : = m \overset{R}{\leftarrow} nonce; s \overset{R}{\leftarrow} seed; event Asends (m); return (A, m, sign (m, sk, s)) \\ P_{B} \overset{def}{=} OB (h^{'} : host, m^{'} : nonce, s^{'} : signature) : = \\ get KeyTbl (h, pkh) suchthat h^{'} = h in \\ if check (m^{'}, pkh, s^{'}) then (event Baccepts (h^{'}, m^{'}); return ()) else end \\ P_{R} \overset{def}{=} OR (h : host, pkh : pkey) : = if h \neq A then insert KeyTbl (h, pkh) \end{array}$ When A’s key pair is created, the pair $(A, pk)$ is added to the key table $KeyTbl$ , to record that $pk$ is the public key of A. The additional oracle $OR$ allows the adversary to record its own public keys in the key table for any host name other than the honest host A. Hence, the model allows B to interact both with the honest participant A and with any other dishonest participants. The message sent by A additionally contains the host name A, and B uses the host name $h^{'}$ to get the corresponding key $pkh$ , which he uses to verify the signature. The event $Baccepts$ also contains $h^{'}$ as additional argument: it means that B accepts the message $m^{'}$ coming from $h^{'}$ . 
The desired security property is that, with high probability, if B accepts the message $m^{'}$ coming from A, then A sent the message $m^{'}$ , that is, if event $Baccepts (A, m^{'})$ has been executed, then event $Asends (m^{'})$ has also been executed.

Tables of keys appear in many realistic protocols. For instance, the SSH client stores a table that contains the public keys and the names of the servers it connected to.

4.2. Formal semantics

Fig. 3.

Semantics (1).

We present the semantics of the language in Figs 3 and 4. The semantics is defined as a reduction relation on semantic configurations, which are tuples of the form $C = E, P, T, Q, S, E$ .

The environment E is a mapping from array cells $x [\tilde{a}]$ to their contents, where x is a variable, $\tilde{a}$ gives the value of its replication indices, and the contents of $x [\tilde{a}]$ is a bitstring value. The environment keeps every binding ever bound, thanks to replication indices, so it is ever increasing.

The oracle body P is the oracle body currently running.

The mapping $T$ maps table names to their contents, which is the list of elements inserted in the table.

The set $Q$ contains the set of the callable oracle definitions.

The list $S$ is the call stack, which consists of triples containing the variables to which the result should be bound and two oracle bodies: the first is executed if the oracle returns a result with a $return$ statement, and the second is executed if the oracle terminates with an $end$ statement.

The list $E$ is the list of events $ev (a_{1}, \dots, a_{k})$ executed so far, by the construct $event ev (M_{1}, \dots, M_{k})$ .

During execution, terms may be reduced into constant bitstrings, so we add constant bitstrings a to the grammar of terms M. The notation $E \cdot M ⇓ a$ means that the term M evaluates to the bitstring a under the environment E. This relation is defined by rules (Cst), (Var) and (Fun) in Fig. 3.

Fig. 4.

Semantics (2).

The semantics is defined by probabilistic reduction rules between configurations: $C \to_{p} C^{'}$ means that $C$ reduces into $C^{'}$ with probability p. This relation is defined in the part “Oracle bodies (1)” of Fig. 3 and in Fig. 4.

The rule (New) evaluates $x [{\tilde{a}}^{'}] \overset{R}{\leftarrow} T$ by choosing an element $a \in T$ and storing it in $E (x [{\tilde{a}}^{'}])$ . The element $a \in T$ is chosen uniformly, so each choice has probability $1 / | T |$ ; such a uniform choice is possible only when T is a fixed-length type. The rule (Let) evaluates the term M and stores its value in $E (x [{\tilde{a}}^{'}])$ . The rules (If1) and (If2) are straightforward.

The rules (Insert), (Get1) and (Get2) deal with tables of keys. The rule (Insert) evaluates the inserted element and adds it to the table $Tbl$ , by adding it to the list $T (Tbl)$ . The rules (Get1) and (Get2) compute the list of elements that satisfy the condition of the $get$ . When this list is empty, the $else$ branch is taken by rule (Get2). When this list is not empty, the rule (Get1) chooses an element of this list l, stores it in $E (x_{1} [{\tilde{a}}^{'}]), \dots, E (x_{k} [{\tilde{a}}^{'}])$ , and takes the $in$ branch. The jth element of the list l is chosen with probability $almostunif (\{1, \dots, | l |\}, j)$ . In case the same element $a_{1}^{0}, \dots, a_{k}^{0}$ occurs several times in the list l, the probability of choosing that element is the sum of the probabilities of all its occurrences. The probability of choosing $a_{1}^{0}, \dots, a_{k}^{0}$ is then close to $m / | l |$ , where m is the number of times this element appears in l.
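
The behaviour of these three rules can be sketched in OCaml for the key table of Example 4.2. This is a simplified model under stated assumptions: rows are pairs of strings, the names `row`, `insert`, `get` and `lookup_key` are hypothetical, and the standard `Random.int` function stands in for the $almostunif$ distribution.

```ocaml
(* A table row: (host name, public key). Strings stand in for both types. *)
type row = string * string

(* (Insert): add the row to the list of table contents. *)
let insert (r : row) (tbl : row list) : row list = r :: tbl

(* (Get1)/(Get2): collect the rows satisfying the condition.
   None   : no row matches, take the else branch (Get2).
   Some r : one matching row was chosen, take the in branch (Get1).
   ASSUMPTION: Random.int stands in for the almostunif distribution. *)
let get (cond : row -> bool) (tbl : row list) : row option =
  match List.filter cond tbl with
  | [] -> None
  | l -> Some (List.nth l (Random.int (List.length l)))

(* The get of Example 4.2: look up the key recorded for host h'. *)
let lookup_key (h' : string) (tbl : row list) : string option =
  match get (fun (h, _) -> h = h') tbl with
  | None -> None
  | Some (_, pkh) -> Some pkh
```

When the same row occurs several times in the table, it occurs several times in the filtered list as well, so it is proportionally more likely to be chosen, in line with the remark on repeated elements.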

The rule (Call) implements the oracle call $let (x_{1} [\tilde{a}] : T_{1}, \dots, x_{k^{'}} [\tilde{a}] : T_{k^{'}}) = O [M_{1}, \dots, M_{l}] (N_{1}, \dots, N_{k}) in P else P^{'}$ . It evaluates the indices $M_{1}, \dots, M_{l}$ of the oracle to call into ${\tilde{a}}^{'}$ and its arguments $N_{1}, \dots, N_{k}$ into $b_{1}, \dots, b_{k}$ ; after evaluation, we want to call the oracle $O [{\tilde{a}}^{'}] (b_{1}, \dots, b_{k})$ . Then, it looks for the definition $Q_{0}$ of the oracle $O [{\tilde{a}}^{'}]$ in the callable oracles $Q$ . It calls $Q_{0}$ by removing it from the callable oracles, storing $b_{1}, \dots, b_{k}$ in the arguments of $Q_{0}$ , and running its body $P^{″}$ . The element $((x_{1} [\tilde{a}], \dots, x_{k^{'}} [\tilde{a}]), P, P^{'})$ is pushed on the stack $S$ : $x_{1} [\tilde{a}], \dots, x_{k^{'}} [\tilde{a}]$ are the variables in which the return value of $Q_{0}$ should be stored, P is the process to execute when $Q_{0}$ returns, and $P^{'}$ is the process to execute when $Q_{0}$ terminates with $end$ .

The rule (Return) pops an element $((x_{1} [\tilde{a}], \dots, x_{k^{'}} [\tilde{a}]), P, P^{'})$ from the stack, stores the return value in $x_{1} [\tilde{a}], \dots, x_{k^{'}} [\tilde{a}]$ , and executes P. It adds to the set of callable oracles $Q$ the oracles $Q^{'}$ defined in the oracle definition $Q^{″}$ located after the return statement. The set $oracledefset (Q)$ contains all oracle definitions provided by the oracle definition Q, with replication indices instantiated to all their possible values. It is defined as follows: $\begin{array}{ll} oracledefset (0) \overset{def}{=} \emptyset & (Nil) \\ oracledefset (Q_{1} ∣ Q_{2}) \overset{def}{=} oracledefset (Q_{1}) \cup oracledefset (Q_{2}) & (Par) \\ oracledefset (foreach i ⩽ n do Q) \overset{def}{=} ⋃_{a = 1}^{n} oracledefset (Q \{a / i\}) & (Repl) \\ oracledefset (O [\tilde{i}] (x_{1} [\tilde{i}] : T_{1}, \dots, x_{k} [\tilde{i}] : T_{k}) : = P) & \\ \quad \overset{def}{=} \{O [\tilde{i}] (x_{1} [\tilde{i}] : T_{1}, \dots, x_{k} [\tilde{i}] : T_{k}) : = P\} & (Oracle) \end{array}$ The notation $Q \{a / i\}$ means that we replace all occurrences of i by a in Q.
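
The four defining equations of $oracledefset$ translate directly into a recursive OCaml function. The following is a minimal sketch under simplifying assumptions: oracle bodies and arguments are omitted, sets are represented as lists, and the types `index` and `qdef` and the function `subst` are hypothetical names introduced for illustration.

```ocaml
(* Replication indices are either still symbolic or already instantiated. *)
type index = Var of string | Val of int

(* Oracle definitions; bodies and arguments are omitted (ASSUMPTION). *)
type qdef =
  | Nil                          (* 0 *)
  | Par of qdef * qdef           (* Q1 | Q2 *)
  | Repl of string * int * qdef  (* foreach i <= n do Q *)
  | Oracle of string * index list (* O[i~](...) := P *)

(* Q{a/i}: replace all occurrences of index i by the value a in Q. *)
let rec subst (i : string) (a : int) (q : qdef) : qdef =
  match q with
  | Nil -> Nil
  | Par (q1, q2) -> Par (subst i a q1, subst i a q2)
  | Repl (j, n, q') -> if j = i then q else Repl (j, n, subst i a q')
  | Oracle (name, idx) ->
      Oracle (name, List.map (function Var j when j = i -> Val a | x -> x) idx)

(* One case per rule: (Nil), (Par), (Repl), (Oracle). *)
let rec oracledefset (q : qdef) : (string * index list) list =
  match q with
  | Nil -> []
  | Par (q1, q2) -> oracledefset q1 @ oracledefset q2
  | Repl (i, n, q') ->
      List.concat
        (List.map
           (fun a -> oracledefset (subst i a q'))
           (List.init n (fun k -> k + 1)))
  | Oracle (name, idx) -> [ (name, idx) ]
```

Applied to a parallel composition of two replicated oracles with bounds 2 and 3, the function returns five instantiated oracle definitions, one per value of each replication index.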

The rule (End) also pops an element $((x_{1} [\tilde{a}], \dots, x_{k^{'}} [\tilde{a}]), P, P^{'})$ from the stack, but executes the process $P^{'}$ . The rule (Event) adds the executed event to the list of events $E$ .

The rules (Loop1) and (Loop2) implement the $loop$ statement. The rule (Loop1) performs one iteration of the loop. To that effect, it creates two fresh variable names $r_{a_{1}^{'}, r}^{'}$ and $b_{a_{1}^{'}, r}$ , calls the oracle O and stores its return values in these variables. When the value $b_{a_{1}^{'}, r} [\tilde{a}]$ returned by O is $stop$ , that is, $false$ , it ends the loop and continues by executing P with the result $r [\tilde{a}]$ bound to the value of $r_{a_{1}^{'}, r}^{'} [\tilde{a}]$ . When $b_{a_{1}^{'}, r} [\tilde{a}]$ is $continue$ , that is, $true$ , it reruns the loop. If the oracle O terminates with an $end$ statement, it ends the loop and continues by executing $P^{'}$ . The rule (Loop2) handles the case in which the loop stops by reaching the bound $N_{1}$ of the loop index.

The initial configuration for running the oracle definition $Q_{0}$ is $C_{i} (Q_{0}) \overset{def}{=} \emptyset, let x [] : bitstring = O_{start} () in return (x) else end, T_{0}, oracledefset (Q_{0}), \emptyset, []$ , where $T_{0} (Tbl) = []$ for all tables $Tbl$ . This configuration starts by calling oracle $O_{start}$ . The oracle definition $Q_{0}$ typically contains a protocol in parallel with an adversary.

CryptoVerif verifies the following requirements on $Q_{0}$ .

Property 4.3.

Variables are renamed so that each variable has a single definition. The indices $\tilde{i}$ of a variable $x [\tilde{i}]$ are always the indices of replications above the definition of x.

Property 4.3 makes sure that a distinct array cell is used in each copy of a process, so that all values of the variables during execution are kept in memory. (This helps in cryptographic proofs.)

Property 4.4.

The processes are well typed. (In particular, functions and oracles receive arguments of their expected types. For brevity, we do not detail the type system; see [7] for a similar type system.)

Property 4.4 requires the adversary to be well typed. This requirement does not restrict its computing power, because well-typed processes are Turing-complete, since primitives can implement any deterministic Turing machine. The type system also does not restrict the class of protocols that we consider, since the protocol may contain type-cast functions $f : T \to T^{'}$ to bypass the type system. The type system just makes explicit which set of bitstrings may appear at each point of the protocol.

Property 4.5.

We define types of oracles as follows. The type of a $return (M_{1}, \dots, M_{k}); Q$ statement consists of the types of $M_{1}, \dots, M_{k}$ and the list of types of the oracle definitions at the beginning of Q, ordered from left to right. The type of an oracle definition consists of the oracle name, the bounds of the replications above that oracle definition, the types of the arguments of the oracle, and the common type of its return statements.

An oracle may have several $return$ statements, but they must be of the same type. When there are several definitions of an oracle with the same name O, they must be of the same type.

Property 4.5 guarantees that the various definitions of an oracle are consistent, and can in fact be compiled into a single function in OCaml. The oracles at the beginning of Q are the oracles found in Q without recursively looking into oracle definitions.

Property 4.6.

Oracles with the same name can be defined only in different branches of an $if$ or $get$ construct. In an oracle definition $O [\tilde{i}] (x_{1} [\tilde{i}] : T_{1}, \dots, x_{k} [\tilde{i}] : T_{k}) : = P$ , the indices $\tilde{i}$ are always the indices of replications above that oracle definition.

Property 4.6 guarantees that there exists a single callable definition for each oracle. This property is formalized by the following lemma, proved in Appendix A (see the Supplemental material).

Lemma 4.7 (Oracle name and indices unicity).

If the configuration $C = E, P, T, Q, S, E$ is reachable from the initial configuration $C_{i} (Q_{0})$ by reductions $\to_{p}$ , then the set of callable oracles $Q$ contains at most one oracle with a given name O and given replication indices $\tilde{a}$ .

This lemma proves that the rule (Call) is deterministic. Therefore, all rules are deterministic, except the rules (New) and (Get1) which may make probabilistic choices.

As a consequence, if a configuration $C$ is non-blocking (that is, $C \to_{p} C^{'}$ for some p and $C^{'}$ ), then the sum of the probabilities of all the possible reductions from $C$ is 1: $\sum_{\{C^{'} ∣ C \to_{p (C^{'})} C^{'}\}} p (C^{'}) = 1 .$

Definition 4.8 (Traces).

Let us denote traces with the symbol $C T$ . A trace is a sequence of reductions $C T = C_{0} \to_{p_{1}} \dots \to_{p_{n}} C_{n}$ where $C_{0}, \dots, C_{n}$ are semantic configurations such that $C_{i} \to_{p_{i + 1}} C_{i + 1}$ for $i = 0, \dots, n - 1$ .

A complete trace is a trace whose last configuration is blocking.

The probability of the trace $C T$ is $Pr [C T] = p_{1} \times \dots \times p_{n}$ . When no trace in a set of traces $C T S$ is a prefix of another, the probability of $C T S$ is the sum of the probabilities of its elements.
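
These two probabilities can be computed as follows. This is a minimal sketch in which a trace is represented only by the list of its step probabilities $p_{1}, \dots, p_{n}$ ; the names `trace`, `trace_prob`, `trace_length` and `trace_set_prob` are hypothetical, and the prefix-freeness condition is assumed rather than checked.

```ocaml
(* A trace is represented only by the probabilities p_1, ..., p_n of its
   steps (ASSUMPTION: configurations themselves are not modelled). *)
type trace = float list

(* Pr[CT] = p_1 * ... * p_n; the empty trace has probability 1. *)
let trace_prob (t : trace) : float = List.fold_left ( *. ) 1.0 t

(* |CT| = n, the number of steps. *)
let trace_length (t : trace) : int = List.length t

(* Probability of a set of traces, as the sum of the trace probabilities.
   PRECONDITION (not checked): no trace in the set is a prefix of another. *)
let trace_set_prob (ts : trace list) : float =
  List.fold_left (fun acc t -> acc +. trace_prob t) 0.0 ts
```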

The notation $C \to_{p}^{*} C^{'}$ means that there exists a trace beginning at $C$ and ending at $C^{'}$ , and p is the probability of the set of all traces beginning at $C$ and stopping at their first occurrence of $C^{'}$ .

The notation $C \to_{p}^{+} C^{'}$ means that $C \to_{p}^{*} C^{'}$ and $C \neq C^{'}$ , that is, all traces from $C$ to $C^{'}$ have at least one step.

The notation $C \to^{*} C^{'}$ means $C \to_{1}^{*} C^{'}$ . We denote the number of steps in the trace $C T$ as $| C T | = n$ .

Intuitively, when no trace in $C T S$ is a prefix of another, the traces in $C T S$ correspond to disjoint cases, so the probability of $C T S$ is the sum of probabilities of the traces in $C T S$ . (When $C T$ is a prefix of ${C T}^{'}$ , the trace ${C T}^{'}$ is a particular case of $C T$ .) In the notation $C \to_{p}^{*} C^{'}$ , we consider the set $C T S$ of all traces beginning at $C$ and stopping at their first occurrence of $C^{'}$ . No trace in this set is a prefix of another: if a trace ${C T}_{1}$ was a prefix of ${C T}_{2}$ with both traces in $C T S$ , then ${C T}_{2}$ would contain $C^{'}$ in the middle, at the end of the prefix ${C T}_{1}$ , so it would not stop at the first occurrence of $C^{'}$ , which contradicts the definition of $C T S$ . Therefore, the probability $p = Pr [C T S]$ is well defined.

In CryptoVerif, every reduction that makes a probabilistic choice modifies the environment E in such a way that the chosen reduction can be determined from E, and elements are never removed from E. Hence, there is at most one trace from one configuration to another. However, the notations of Definition 4.8 are also used for OCaml, where several configurations may reduce to the same configuration, so these notations support that situation.

Finally, the security properties are defined using distinguishers D which are functions that take a list of events $E$ and return true or false. We denote by $Pr [C :^{(CV)} D]$ the probability of the set of complete CryptoVerif traces starting at $C$ and such that the list of events $E$ in their last configuration satisfies $D (E) = true$ . We define D such that $D (E) = true$ if and only if $E$ does not satisfy the desired security property. We represent the adversary for $Q_{0}$ by any CryptoVerif process $Q_{adv}$ that does not contain events nor variables that occur in $Q_{0}$ . Then CryptoVerif bounds the probability $Pr [C_{i} (Q_{0} ∣ Q_{adv}) :^{(CV)} D]$ , that is, the probability that the adversary $Q_{adv}$ breaks the desired security property in $Q_{0}$ , for any adversary $Q_{adv}$ for $Q_{0}$ .

Example 4.9.
To show that the protocol $Q_{0}$ of Example 4.1 satisfies the correspondence c “for all $m^{'}$ , if $Baccepts (m^{'})$ has been executed, then $Asends (m^{'})$ has also been executed”, we define $D_{c}$ by $D_{c} (E) = true$ if and only if the correspondence does not hold, that is, $E$ contains $Baccepts (m^{'})$ but not $Asends (m^{'})$ for some $m^{'}$ . Then CryptoVerif shows that for all $Q_{adv}$ , $Pr [C_{i} (Q_{0} ∣ Q_{adv}) :^{(CV)} D_{c}] ⩽ {Succ}_{sign}^{uf - cma} (t + (N_{2} - 1) t_{check}, N_{1}),$ where t is the execution time of the adversary $Q_{adv}$ , $t_{check}$ is the maximum execution time of a call to $check$ , $N_{1}$ is the maximum number of calls to oracle $OA$ , $N_{2}$ is the maximum number of calls to oracle $OB$ , and ${Succ}_{sign}^{uf - cma} (t^{'}, n^{'})$ is the probability of forging a signature in time $t^{'}$ with at most $n^{'}$ calls to the signature oracle. When the signatures are UF-CMA, the probability ${Succ}_{sign}^{uf - cma} (t^{'}, n^{'})$ is small for reasonable values of $t^{'}$ and $n^{'}$ . Hence $Pr [C_{i} (Q_{0} ∣ Q_{adv}) :^{(CV)} D_{c}]$ is small as well, and the desired security property holds.
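
The distinguisher $D_{c}$ can be written as an OCaml function on lists of events. This is only an illustrative sketch: the type `event` and the function `d_c` are hypothetical names, and nonces are represented by strings.

```ocaml
(* Events of Example 4.1; nonces are modelled as strings (ASSUMPTION). *)
type event = Asends of string | Baccepts of string

(* d_c events = true iff the correspondence is violated, that is, the list
   contains Baccepts m' but no Asends m' for some m'. *)
let d_c (events : event list) : bool =
  List.exists
    (function
      | Baccepts m -> not (List.mem (Asends m) events)
      | Asends _ -> false)
    events
```

For example, `d_c [Asends "n1"; Baccepts "n1"]` is `false` because the correspondence holds, whereas `d_c [Baccepts "n2"]` is `true`.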

We can also define secrecy using events and distinguishers [8].

4.3. Annotations

In order to compile a CryptoVerif process into an implementation, we added annotations to the language, to specify implementation details.

First, we separate the parts of the process that correspond to different roles, such as client and server, which will be included in different OCaml programs in the generated implementation. We annotate processes to specify roles: the beginning of a role $role$ is specified by adding the annotation $role \{$ just before the oracle definition $Q$ that starts $role$ , where $role$ is the name of a role, such as $alice$ or $bob$ ; the end of the role $role$ is specified by a closing brace $\}$ between a $return (\dots)$ and its following oracle definition $Q^{'}$ . We denote by $Q (role)$ the part of the process corresponding to the role $role$ . A role can contain several oracles, and can thus represent a protocol participant that receives or sends several messages, for instance as follows: $role \{O_{1} (x_{1} : T_{1}) : = \dots return (M_{2}); O_{2} (x_{3} : T_{3}) : = \dots return (M_{4})\} .$ In this example, the role $role$ receives $x_{1}$ , replies with $M_{2}$ , then receives $x_{3}$ and replies with $M_{4}$ . The adversary schedules this exchange by calling $O_{1} (M_{1})$ , getting $M_{2}$ as answer, then calling $O_{2} (M_{3})$ , and getting $M_{4}$ as answer.

The process for a role $Q (role)$ may have free variables, but CryptoVerif requires that these free variables be defined under no replication, so that they can be passed from the process that defines them to the process $Q (role)$ , which uses them, simply by storing each variable in a file. (There must be a single value to store, not one for each value of the replication indices. Storing variables in files is useful for variables that are communicated across roles, for example long-term keys that are set in a key generation program and later used by client and server programs. Using files is not the only possible implementation: we only need an implementation that provides persistent storage and guarantees that only our generated code has access to stored data. In particular, the adversary must not have access to stored data.) The user must also declare, for each free variable $x []$ in a role, the file $file$ in which the variable will be stored. Let $Files$ be the set of these pairs $(x [], file)$ . Let also $Tables$ be the set of pairs $(Tbl, file)$ such that the table $Tbl$ will be stored in file $file$ .
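
Storing a variable in a file and reading it back can be sketched as follows. This is not the code produced by the compiler: `write_var` and `read_var` are hypothetical helpers that treat the serialized value as an OCaml string, and plain files do not by themselves ensure that the adversary has no access to the stored data.

```ocaml
(* Write the serialized contents of a variable to its file. *)
let write_var (file : string) (contents : string) : unit =
  let oc = open_out_bin file in
  output_string oc contents;
  close_out oc

(* Read back the whole file as a string. *)
let read_var (file : string) : string =
  let ic = open_in_bin file in
  let n = in_channel_length ic in
  let s = really_input_string ic n in
  close_in ic;
  s
```

In this picture, a key generation program would call `write_var` once for each pair in $Files$ , and the programs of the other roles would call `read_var` on the same file names.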

Example 4.10.
Let us annotate the protocol of Example 4.1. $\begin{array}{rcl} keygen [pk > pkfile, sk > skfile] \{Okeygen () : = \\ rk \overset{R}{\leftarrow} keyseed; pk \leftarrow pkgen (rk); sk \leftarrow skgen (rk); return (pk)\}; \\ (foreach i_{1} ⩽ N_{1} do P_{A} ∣ foreach i_{2} ⩽ N_{2} do P_{B}) \\ P_{A} \overset{def}{=} alice \{OA () : = m \overset{R}{\leftarrow} nonce; s \overset{R}{\leftarrow} seed; event Asends (m); return (m, sign (m, sk, s)) \\ P_{B} \overset{def}{=} bob \{OB (m^{'} : nonce, s^{'} : signature) : = \\ if check (m^{'}, pk, s^{'}) then (event Baccepts (m^{'}); return ()) else end \end{array}$ We divide this process into three parts. First, the key generation part is represented by the role $keygen$ , which contains just the oracle $Okeygen$ . The annotation $pk > pkfile$ , $sk > skfile$ means that we store the public key $pk$ in the file $pkfile$ so that all replications of oracle $OB$ can access it, and analogously, we store the secret key $sk$ in the file $skfile$ so that all replications of oracle $OA$ can access it. In other words, $Files = \{(pk [], pkfile), (sk [], skfile)\}$ .

The role $alice$ , which contains the oracle $OA$ , corresponds to the role of Alice and the role $bob$ , which contains the oracle $OB$ , corresponds to the role of Bob. For these two roles, there is no need to write the closing braces } because there is nothing after them.

Finally, the user annotations provide, for each CryptoVerif type T, the corresponding OCaml type $G_{T} (T)$ as well as several OCaml functions:
The function $G_{random} (T) : unit \to G_{T} (T)$ generates random numbers uniformly in T (when T is used in a random number generation).

The serialization function $G_{ser} (T) : G_{T} (T) \to string$ converts an element of type $G_{T} (T)$ to an OCaml string. The deserialization function $G_{deser} (T) : string \to G_{T} (T)$ performs the inverse operation. When deserialization fails, it must raise the exception $Bad\_file$ ; this exception is raised only when a file has been corrupted. These functions are present when values of type T are written to or read from tables and files.

The predicate function $G_{pred} (T) : G_{T} (T) \to bool$ returns $true$ if its argument corresponds to an element of type T and $false$ otherwise (when T is present in the interface of the oracle definitions).
The user annotations also provide, for each CryptoVerif function $f : T_{1} \times \dots \times T_{m} \to T$ , a corresponding OCaml function $G_{f} (f) : G_{T} (T_{1}) \times \dots \times G_{T} (T_{m}) \to G_{T} (T)$ . We assume that these functions are all provided in an OCaml module $μ_{prim}$ .
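
The per-type annotations can be pictured as an OCaml module signature with one possible instance. This is a sketch under stated assumptions: the signature `TYPE_ANNOT`, the module `Nonce` and its 8-byte length are hypothetical, and the standard `Random` module is used only as a stand-in, since it is not a cryptographically secure generator.

```ocaml
exception Bad_file

(* One module per CryptoVerif type T (hypothetical packaging). *)
module type TYPE_ANNOT = sig
  type t                   (* G_T(T) *)
  val random : unit -> t   (* G_random(T) *)
  val ser : t -> string    (* G_ser(T) *)
  val deser : string -> t  (* G_deser(T), raises Bad_file on failure *)
  val pred : t -> bool     (* G_pred(T) *)
end

(* ASSUMPTION: nonces are 8-byte strings; Random is NOT a secure generator
   and only stands in for a proper random number source. *)
module Nonce : TYPE_ANNOT with type t = string = struct
  type t = string
  let length = 8
  let random () = String.init length (fun _ -> Char.chr (Random.int 256))
  let ser s = s
  let deser s = if String.length s = length then s else raise Bad_file
  let pred s = String.length s = length
end
```

With such an instance, a value obtained from `Nonce.random ()` satisfies `Nonce.pred`, and `Nonce.deser (Nonce.ser n)` returns `n`.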

CryptoVerif verifies the following properties.
Property 4.11.
There is a single occurrence of each role $role$ . If a role is defined after an oracle O, this oracle O must have globally at most one $return$ , and must be in a role.

This property guarantees that we know which process to compile for a given role, and which roles start after the return from a given oracle.
Property 4.12.
There are no nested roles.

Furthermore, for simplicity, we also assume the following points.
Assumption 4.13.
All oracle definitions are included in a role.

This assumption is relaxed in the implementation: we accept all processes in which no oracle in a role is preceded by an oracle that is not in a role. In practice, oracles outside a role serve in representing features, such as corruption of protocol participants or registration of dishonest participants, that are needed in the proof of the security property but not in the implementation of the protocol. For instance, in Example 4.2, the oracle $OR$ would typically not be included in a role. To extend our proof to the general case, if the process $Q_{0}$ does not satisfy Assumption 4.13, we transform it into a process $Q_{0}^{'}$ that satisfies Assumption 4.13 by adding roles or by continuing existing roles till the end of the process instead of terminating them. The generated OCaml modules for $Q_{0}^{'}$ contain unused OCaml code, which is not generated for $Q_{0}$ . It is fairly obvious that removing this code preserves the security of the implementation.
Assumption 4.14.
No replication occurs above a parallel composition or a replication. When the definition of a role $role$ is under replication $foreach i ⩽ N do role \{Q$ , its contents Q consists of an oracle definition $O [\tilde{i}] (\dots) : = \dots$ or of a parallel composition of such oracle definitions (without replication).

A process can be transformed so that no replication occurs above a parallel composition by distributing the replications into the parallel compositions: $foreach i ⩽ N do (Q_{1} ∣ Q_{2})$ can be transformed into $(foreach i_{1} ⩽ N do Q_{1}) ∣ (foreach i_{2} ⩽ N do Q_{2})$ : both processes allow calling the oracles defined in $Q_{1}$ and $Q_{2}$ at most N times. We can encode nested replications by adding a dummy oracle between the two replications: the process $foreach i ⩽ N do foreach j ⩽ N^{'} do Q$ can be transformed into $foreach i ⩽ N do O () : = return (); foreach j ⩽ N^{'} do Q$ .

By Properties 4.6, 4.5 and 4.11, there cannot be, in the same process, a definition of an oracle O directly under replication and another definition of the same oracle O not directly under replication. Hence, we can use the phrase “O is under replication” unambiguously. Moreover, by Property 4.5, the bound of the replication above a definition of an oracle O is the same for all definitions of O.
Assumption 4.15.
For each oracle O under replication, we let $N_{O}$ be the bound of the replication above the definition of O. For each role $role$ under replication, we let $N_{role}$ be the bound of the replication above the definition of $role$ . All these bounds $N_{O}$ and $N_{role}$ are pairwise distinct.

After transforming the process so that it satisfies Assumption 4.14, we can transform it into a process that satisfies Assumption 4.15 by renaming the bounds of replications above distinct roles or oracles to distinct bounds. For instance, $(foreach i_{1} ⩽ N do Q_{1}) ∣ (foreach i_{2} ⩽ N do Q_{2})$ becomes $(foreach i_{1} ⩽ N_{1} do Q_{1}) ∣ (foreach i_{2} ⩽ N_{2} do Q_{2})$ . Using distinct bounds for each oracle and role allows us to be more precise when counting the number of times an oracle has been called.

Assumptions 4.14 and 4.15 are relaxed in our implementation: we warn the user when the process does not satisfy them, but we accept the process. Not heeding these warnings will lead to CryptoVerif returning imprecise, but sound, probabilities of security. We use these assumptions because they simplify the proof without losing much generality. Our result can be extended to the general case as follows: if the process $Q_{0}$ does not satisfy Assumptions 4.14 or 4.15, we transform it into a process $Q_{0}^{'}$ that satisfies these assumptions as outlined after each assumption. We apply our theorem to $Q_{0}^{'}$ and argue that the implementation generated from $Q_{0}$ is also secure since it is basically the same as the one generated from $Q_{0}^{'}$ .

5. The OCaml language

This section presents the OCaml language, the target language of our compiler, by giving its syntax and semantics. We omit some constructs, such as loops and type constructors, which are not used by our compiler. The subset that we consider is still Turing complete, so we do not lose expressivity by removing these constructs. To define the formal semantics, we adapted the small step operational semantics of the core part of OCaml by Scott Owens et al. [17,18].

5.1. Syntax and informal semantics

Figure 5 summarizes the syntax of our subset of OCaml. For brevity, we ignore types in this syntax.

Fig. 5.

OCaml syntax.

Pattern-matching is a central feature of OCaml. A pattern $pat$ describes the form of a value to be matched. When we match a value v with a pattern $pat$ , if the value is of the correct form, then we bind each variable x occurring in the pattern $pat$ to the corresponding part of v. Patterns must be linear, that is, no variable can occur more than once inside a pattern. When we match a value v with the pattern matching ${pat}_{1} \to e_{1} ∣ \dots ∣ {pat}_{n} \to e_{n}$ , we match v sequentially to the patterns ${pat}_{1}, \dots, {pat}_{n}$ . If the first pattern that matches v is ${pat}_{i}$ , then we evaluate $e_{i}$ . If no pattern matches v, then we raise the exception $Match_failure$ .
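
The sequential matching order can be observed on a small OCaml example. The functions `classify` and `swap` are hypothetical names chosen for illustration.

```ocaml
(* Patterns are tried from top to bottom; the first one that matches wins. *)
let classify : int * int -> string = function
  | (0, _) -> "first is zero"
  | (_, 0) -> "second is zero"
  | (x, y) -> "sum " ^ string_of_int (x + y)

(* A pattern binds each of its variables to the corresponding part of the
   matched value; each variable occurs once (patterns are linear). *)
let swap ((x, y) : 'a * 'b) : 'b * 'a = (y, x)
```

The value `(0, 0)` matches both of the first two patterns, and `classify (0, 0)` evaluates the expression of the first one.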

The basic operations of the language are implemented by primitives $prim$ . We write binary primitives in infix notation: for example, we write $v_{1} = v_{2}$ rather than $(=) v_{1} v_{2}$ . We consider the following primitives: $not$ is the boolean negation, $(=)$ is the equality test, $raise e$ raises the exception e. We use primitives to manage references, which are mutable memory cells. We represent memory cells by locations l; we also use special locations to represent files. The reference creation $ref v$ creates a new location l, stores the value v in l, and returns the location l. The assignment $l : = v$ replaces the contents of the location l with the value v. The dereference $! l$ returns the contents of the location l. We also introduce a primitive for random number generation: $random ()$ returns a random boolean, $true$ or $false$ , with equal probability. This primitive was not present in [17,18]. It formalizes Assumption (A1) that our implementation uses a perfect random number generator. It makes the semantics probabilistic. The language also includes primitives to manage other native types such as integers (e.g., addition and multiplication) and strings (e.g., concatenation, extraction of substrings, and conversion between integers in $\{0, \dots, 255\}$ and one-character strings). Strings are immutable values in our semantics. In contrast, in OCaml, values of type string are mutable. Our strings could be implemented in OCaml as an abstract type, on which only operations that do not mutate strings are implemented.
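
The reference primitives and the use of a boolean random primitive can be illustrated as follows. This is only a sketch: `next` and `random_byte` are hypothetical names, and the standard function `Random.bool` stands in for the $random ()$ primitive of the semantics, although it is not a perfect random number generator.

```ocaml
(* ref, := and ! : create a cell, update it, and read it back. *)
let next (counter : int ref) : int =
  counter := !counter + 1;
  !counter

(* Build an integer in {0, ..., 255} from 8 random booleans.
   ASSUMPTION: Random.bool stands in for the random () primitive. *)
let random_byte () : int =
  let rec go k acc =
    if k = 0 then acc
    else go (k - 1) ((2 * acc) + (if Random.bool () then 1 else 0))
  in
  go 8 0
```

Larger random values can be assembled in the same way from repeated calls to the boolean primitive.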

Most expressions are standard. Constants c can be integers, strings, boolean values $true$ or $false$ , the empty list $[]$ , the unit constant $()$ , exceptions, and constant constructors. The expression $function pm$ defines a function. When this function is applied to a value v, it matches that value using the pattern matching $pm$ . The application $e_{1} e_{2}$ applies the function $e_{1}$ to the argument $e_{2}$ . The sequence operation $e_{1}$ ; $e_{2}$ evaluates $e_{1}$ , ignoring its result (but obviously keeping its side effects), then evaluates $e_{2}$ . The matching operation $match e with pm$ evaluates e and matches the result of e using the pattern matching $pm$ . The try construct $try e with pm$ returns the result of e if e does not raise exceptions; if e raises an exception v matched by a pattern in $pm$ , it returns the result of $match v with pm$ ; if e raises an exception v that is not matched by a pattern in $pm$ , it also raises the exception v. The let binding $let pat = e_{1} in e_{2}$ evaluates $e_{1}$ , matches the result with the pattern $pat$ , which binds the variables in $pat$ , and finally evaluates $e_{2}$ . When the pattern matching fails, it raises the exception $Match_failure$ . This construct is equivalent to $match e_{1} with pat \to e_{2}$ . The let rec binding $let rec x_{1} = function {pm}_{1} and \dots and x_{n} = function {pm}_{n} in e$ defines n mutually recursive functions $x_{1}, \dots, x_{n}$ , and evaluates the expression e using these functions.
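
The $try$ and $let rec$ constructs can be illustrated by a short OCaml example. The exception `Negative` and the functions `is_even`, `is_odd` and `safe_is_even` are hypothetical names chosen for illustration.

```ocaml
exception Negative

(* Two mutually recursive functions defined by a single let rec. *)
let rec is_even : int -> bool = function
  | 0 -> true
  | n -> if n < 0 then raise Negative else is_odd (n - 1)

and is_odd : int -> bool = function
  | 0 -> false
  | n -> if n < 0 then raise Negative else is_even (n - 1)

(* try returns the result of the body when no exception is raised, and
   the result of the matching handler when Negative is raised. Any other
   exception is not matched by the handler and is raised again. *)
let safe_is_even (n : int) : bool option =
  try Some (is_even n) with Negative -> None
```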

Closures are not present in the initial program, but they serve to represent functional values internally. The closure $function [env, pm]$ comes from the function $function pm$ . It contains the code of the function ( $pm$ ), and an environment $env$ that maps the free variables of $pm$ to their values. Closures allow one to evaluate functions using the values that the free variables of the function had at the definition of the function. (In other words, OCaml uses static variable binding.) The let rec closure $letrec [env, \{x_{1} \mapsto function {pm}_{1}, \dots, x_{n} \mapsto function {pm}_{n}\} in x_{i}]$ is similar, but for mutually recursive functions. It records several mutually recursive bindings together.
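The effect of static variable binding can be observed directly in OCaml. In the small sketch below (the name `static_binding` is chosen for this example only), the closure `f` keeps the value that `x` had when `f` was defined, even though `x` is rebound afterwards.

```ocaml
let static_binding () =
  let x = 1 in
  let f = fun () -> x in   (* the closure records the binding x = 1 *)
  let x = 2 in             (* rebinding x does not affect f *)
  (f (), x)
```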

A security protocol typically involves several programs running in parallel on different machines. We model this situation by considering several threads. To manage threads, we introduce two new expressions, $addthread (program)$ and $schedule (e)$ . The expression $addthread (program)$ creates a new thread that runs the program $program$ . The expression $schedule (e)$ stops execution of the current thread and continues execution of the thread number e. (Threads are designated by integer numbers. The initial thread, started at the beginning of the program, has number 1. The threads created by subsequent calls to $addthread$ have numbers starting at 2 and increasing by one each time a new thread is created.)

We define the list expression $[e_{1}; e_{2}; \dots; e_{n}]$ as syntactic sugar for $e_{1} :: (e_{2} :: \dots :: (e_{n} :: []) \dots)$ . The expression $e \&\& e^{'}$ is syntactic sugar for $if e then e^{'} else false$ , and $e || e^{'}$ is syntactic sugar for $if e then true else e^{'}$ .
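These encodings can be checked on small OCaml examples. In the sketch below, all names are chosen for illustration, and the operands of the boolean operators are passed as thunks so that the order of evaluation is visible.

```ocaml
(* List notation and its encoding with :: and []. *)
let sugared = [ 1; 2; 3 ]
let desugared = 1 :: (2 :: (3 :: []))

(* Boolean operators and their encodings with if. *)
let and_sugar e e' = e () && e' ()
let and_core e e' = if e () then e' () else false
let or_sugar e e' = e () || e' ()
let or_core e e' = if e () then true else e' ()
```

In both versions, the second operand is evaluated only when the first one does not already determine the result.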

A program is a list of top level definitions d, or the raising of an exception. We omit the final ε in a sequence of definitions when it is not empty.

Expressions reduce into values or exceptional values. As summarized in Fig. 6, the values v are functional values like closures, constants c, locations l, and tuples and lists of values. An exceptional value is $raise v$ , where v is an exception value (a constant).

Fig. 6.

OCaml values.

5.2. Formal semantics

We define step by step the semantics of the various constructs of the language.

5.2.1. Pattern matching

We define the predicate $matches$ in Fig. 7: we have $v matches pat ⊳ env$ when the value v matches the pattern $pat$ , and the environment $env$ is a mapping from the variables of $pat$ to their values, computed by the pattern matching. The operation $env \oplus {env}^{'} \overset{def}{=} {env}_{| \overline{Dom ({env}^{'})}} \cup {env}^{'}$ adds the bindings of ${env}^{'}$ to those of $env$ ; when a variable is bound in both environments, the binding of ${env}^{'}$ is kept. Since patterns are linear, in Fig. 7, the operation $env \oplus {env}^{'}$ is always used with environments $env$ and ${env}^{'}$ that have disjoint domains; the general case is used below. We also define $v matches pat$ as $\exists env, v matches pat ⊳ env$ .
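To make the definition concrete, here is a minimal sketch of the matching predicate and of the operation $\oplus$ for a small fragment of the pattern language. The types `value` and `pat`, the wildcard pattern, and the representation of environments as association lists are assumptions made for this example; the actual rules are those of Fig. 7.

```ocaml
(* Hypothetical fragment of the values and patterns of the language. *)
type value = Int of int | Tuple of value list | Nil | Cons of value * value

type pat =
  | PVar of string      (* variable pattern x *)
  | PWild               (* wildcard pattern *)
  | PInt of int         (* constant pattern *)
  | PTuple of pat list
  | PNil
  | PCons of pat * pat

(* Environments as association lists from variable names to values. *)
type env = (string * value) list

(* oplus env env2: remove from env the variables bound in env2, then add env2. *)
let oplus (env : env) (env2 : env) : env =
  List.filter (fun (x, _) -> not (List.mem_assoc x env2)) env @ env2

(* matches v p returns Some env when v matches p with bindings env,
   and None when v does not match p. *)
let rec matches (v : value) (p : pat) : env option =
  match v, p with
  | _, PVar x -> Some [ (x, v) ]
  | _, PWild -> Some []
  | Int n, PInt m when n = m -> Some []
  | Nil, PNil -> Some []
  | Cons (v1, v2), PCons (p1, p2) -> matches_all [ v1; v2 ] [ p1; p2 ]
  | Tuple vs, PTuple ps -> matches_all vs ps
  | _ -> None

and matches_all vs ps =
  match vs, ps with
  | [], [] -> Some []
  | v :: vs', p :: ps' -> (
      match matches v p, matches_all vs' ps' with
      | Some e1, Some e2 -> Some (oplus e1 e2)
      | _ -> None)
  | _ -> None
```

For linear patterns, `oplus` is only applied to environments with disjoint domains, so the filtering step has no effect; it matters only in the general case.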

Fig. 7.

Matches predicate.

5.2.2. Primitives

The semantics of primitives is defined in Fig. 8. This semantics is defined by rules of the form $prim v_{1} \dots v_{n} {\overset{L}{⟶}}_{p} e$ where $prim$ is an n-ary primitive. Such a rule means that $prim v_{1} \dots v_{n}$ reduces to e with probability p. In contrast to [17,18], the semantics is probabilistic, because of the presence of the primitive $random$ . The probability p is omitted when it is 1. The label L is used to reflect the operations on locations. It is empty when locations are unaffected. The label $ref v = l$ means that a new location l is created, with contents v. The label $! l = v$ means that the current contents of location l is v. The label $l : = v$ means that the contents of the location l is changed into v. The rules are straightforward; they reflect the semantics defined informally in Section 5.1. One is not allowed to test equality between functional values, so we use the predicate $funval$ , also defined in Fig. 8, to test whether a value is functional, and raise the exception $Invalid_argument$ when we try to test equality between functional values. There is no rule for the primitive $raise$ : $raise v$ is an exceptional value, it does not reduce.

Fig. 8.

Rules for OCaml primitives.

5.2.3. Expressions and programs

The semantics of [17,18] substitutes variables with their values. Instead, we define an environment $env$ that maps variables to their values. This way, it is easier to relate the OCaml state to the CryptoVerif state which also contains an environment. Because of this change, we also need to add an explicit call stack $stack$ . The stack is a list of pairs $(env, C_{m})$ , where $C_{m}$ is a minimal evaluation context, that is, an expression with a hole $[\cdot]$ , such that the expression inside the hole can be immediately evaluated. We define a minimal evaluation context as: For example, we evaluate the argument of applications first, and when it becomes a value, we evaluate the function, so $e [\cdot]$ and $[\cdot] v$ are evaluation contexts. Tuples and lists are evaluated from right to left. We denote by $C_{me} [e]$ the context $C_{me}$ with the hole $[\cdot]$ replaced by e, and similarly for $C_{mp}$ . The stack contains a minimal program evaluation context $C_{mp}$ in the last element of the list and expression evaluation contexts $C_{me}$ in the other elements if it is non-empty.

Fig. 9.

Rules for expressions.

Fig. 10.

Rules for expressions (continued).

Fig. 11.

Rules for programs.

Hence, we evaluate expressions and programs by reducing triples $env$ , $pe$ , $stack$ , where $pe$ means program $program$ or expression e. The reduction rules $env, pe, stack {\overset{L}{⟶}}_{p} {env}^{'}, {pe}^{'}, {stack}^{'}$ are defined in Figs 9 and 10 for expressions and Fig. 11 for programs. The label L is defined above in Section 5.2.2. These reductions are probabilistic; the probability p is omitted when it is 1. Most rules are straightforward. In order to evaluate an expression $C_{me} [e]$ , we need to reduce e under the context $C_{me}$ . To do that, we push the context $C_{me}$ on the stack with the current environment by rule (Context in), evaluate the expression e until it becomes a value v, and finally pop the context $C_{me}$ from the stack by rule (Context out), inserting the obtained value v in $C_{me}$ , yielding $C_{me} [v]$ . In case the expression e raises an exception v, we use rules (Context raise1) and (Context raise2). If the context $C_{me}$ is not a $try$ context, the result of $C_{me} [e]$ is also $raise v$ by (Context raise1). If $C_{me}$ is a $try$ context, we evaluate that $try$ by (Context raise2), followed by (Try2). The rules (Let ctx in), (Let ctx out) and (Let ctx raise) play the same role as (Context in), (Context out) and (Context raise1) respectively, for programs instead of expressions: they allow reducing under the minimal program evaluation context $let pat = [\cdot];; definitions$ . There is no rule corresponding to (Context raise2) for programs because there is no $try$ program context.

Example 5.1.

Let us present as an example the reduction of a simple program in an empty environment and an empty stack:

$\emptyset, let x = if random () then 0 else 1;;, []$

We first reduce the expression part of the $let$ , by keeping in the stack the fact that the expression is under the context $let x = [\cdot]$ . This expression reduces eventually to a value, and at this point we insert this value back into the context. So we first reduce the previous configuration by (Let ctx in) into:

$\emptyset, if random () then 0 else 1, [(\emptyset, let x = [\cdot];;)]$

By (Context in), we prepare to reduce the condition of the $if$ :

$\emptyset, random (), [(\emptyset, if [\cdot] then 0 else 1); (\emptyset, let x = [\cdot];;)]$

By (Random), $random ()$ reduces to $true$ with probability $1 / 2$ and $false$ with probability $1 / 2$ . For the purpose of the example, let us consider the case where $random ()$ reduces to $true$ . By (Primitives), the configuration reduces with probability $1 / 2$ into

$\emptyset, true, [(\emptyset, if [\cdot] then 0 else 1); (\emptyset, let x = [\cdot];;)]$

By (Context out), we insert the value of the condition back into the $if$ :

$\emptyset, if true then 0 else 1, [(\emptyset, let x = [\cdot];;)]$

By (If1), we evaluate the $if$ :

$\emptyset, 0, [(\emptyset, let x = [\cdot];;)]$

By (Let ctx out), we insert the obtained value back into the context $let x = [\cdot];;$ :

$\emptyset, let x = 0;;, []$

By (Variable), we have that $0 matches x ⊳ \{x \mapsto 0\}$ . So, by (Let match1), the configuration reduces into the following last configuration:

$\{x \mapsto 0\}, ε, []$

The expressions $addthread (program)$ and $schedule (e)$ are treated specially because they alter parts of the semantic configuration other than $env$ , $pe$ , $stack$ . Their treatment is detailed in Section 5.2.5.

Fig. 12.

Store rules.

5.2.4. Store

As usual, the contents of locations are stored in a $store$ , which maps locations to their current values. Figure 12 defines the relation $store \overset{L}{⟶} {store}^{'}$ . If a program or an expression reduces by $env, pe, stack {\overset{L}{⟶}}_{p} {env}^{'}, {pe}^{'}, {stack}^{'}$ , then the store $store$ will be updated into ${store}^{'}$ such that $store \overset{L}{⟶} {store}^{'}$ . When L is empty, the store is unchanged by rule (Store empty). When L is $! l = v$ , the store is also unchanged, but the reduction succeeds only when the contents of l is v, by rule (Store lookup). When L is $l : = v$ , the store is updated so that l contains v, by rule (Store assign). When L is $ref v = l$ , a new location l is created with contents v, so the contents of l must not be defined in the initial store, by rule (Store alloc).
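The four store rules can be read as a partial function from a store and a label to a new store. The following sketch makes this reading explicit under simplifying assumptions: locations and values are both integers, and the type `label` and the function `step` are names introduced for this example. Whether rule (Store assign) additionally requires l to be already defined is not stated in the text, so this sketch does not check it.

```ocaml
(* Hypothetical encoding: locations are integers, values are integers. *)
module Store = Map.Make (Int)

type label =
  | Empty                 (* no operation on locations *)
  | Alloc of int * int    (* ref v = l, represented as (l, v) *)
  | Lookup of int * int   (* !l = v, represented as (l, v) *)
  | Assign of int * int   (* l := v, represented as (l, v) *)

(* step store lbl returns Some store2 when the store can evolve from
   store to store2 under label lbl, and None when no rule applies. *)
let step (store : int Store.t) (lbl : label) : int Store.t option =
  match lbl with
  | Empty -> Some store
  | Lookup (l, v) ->
      (* succeeds only when the contents of l is v *)
      if Store.find_opt l store = Some v then Some store else None
  | Assign (l, v) -> Some (Store.add l v store)
  | Alloc (l, v) ->
      (* the new location must not be defined in the initial store *)
      if Store.mem l store then None else Some (Store.add l v store)
```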

5.2.5. Toplevel reduction

As mentioned in Section 5.1, and in contrast to [17,18], we consider several threads running in parallel. Each thread has a configuration ${th}_{i} = ⟨ {env}_{i}, {pe}_{i}, {stack}_{i}, {store}_{i} ⟩$ that contains the current ${env}_{i}$ , ${pe}_{i}$ , ${stack}_{i}$ as explained in Section 5.2.3, as well as the contents of locations local to this thread, in a store ${store}_{i}$ , as explained in Section 5.2.4. The complete semantic configuration is then $C = [{th}_{1}, \dots, {th}_{n}], globalstore, tj$ where $tj$ is the number of the thread currently being executed, and $globalstore$ is a store for locations shared between threads. We use it to model the communication between threads by storing messages in global locations, and to store the files containing private data from the CryptoVerif process (free variables of roles and tables). In practice, these files may be copied from one machine to another by the user, so they are actually shared between several threads. The values in the global store contain no closure and no reference. (In OCaml, closures and references can be written to a file only by marshalling, but marshalling is ruled out by Assumption (A5), since it may bypass the type system.) The global store contains locations in a set ${Loc}_{g}$ , while the local stores contain locations in an infinite set ${Loc}_{ℓ}$ , with ${Loc}_{g} \cap {Loc}_{ℓ} = \emptyset$ .

Fig. 13.

Top level rules.

The reduction rules for semantic configurations $C$ are defined in Fig. 13. Actually, this figure defines three relations. The relation $th \to_{p} {th}^{'}$ , defined by rule (Thread), handles all operations that deal with the current thread only. It updates the store using the same label L as the one used for evaluating the program or the expression, and it checks that this label concerns the local store of the thread. (The location l, if any, must be in ${Loc}_{ℓ}$ .)

Second, the relation $th, globalstore \to_{p} {th}^{'}, {globalstore}^{'}$ , defined by rules (Globalstore1) and (Globalstore2), handles all operations local to one thread and the global store operations. By rule (Globalstore1), it uses the relation $th \to_{p} {th}^{'}$ to handle the operations local to one thread. By rule (Globalstore2), it handles the global store operations. It updates the global store using the same label L as the one used for evaluating the program or the expression, and it checks that this label concerns the global store. The location l must be in ${Loc}_{g}$ , and the creation of a location in the global store is forbidden. (Otherwise, one would need a way to tell the system whether a new location should be created in the local or in the global store, and to communicate the global locations to the other threads.) We assume that all locations of the global store are initialized at the beginning of the program.

Finally, the relation $C \to_{p} C^{'}$ , defined by the last four rules of Fig. 13, gives the semantics of the full language. Rule (Toplevel) runs the current thread $tj$ , using the relation $th, globalstore \to_{p} {th}^{'}, {globalstore}^{'}$ . Rule (Toplevel add thread) defines the semantics of $addthread (program)$ : it creates a new thread that runs the program $program$ , with empty environment, stack and store. Rules (Toplevel schedule1) and (Toplevel schedule2) define the semantics of $schedule$ : $schedule ({tj}^{'})$ schedules thread number ${tj}^{'}$ when this thread exists, and otherwise it raises the exception $Invalid_argument$ .
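The thread-management part of these rules can be sketched as follows. This is a deliberately simplified model, not the semantics of Fig. 13: each thread is reduced to a name, the stores are omitted, and the failure of $schedule$ is modeled by raising the OCaml exception `Invalid_argument` directly, whereas in the semantics the exception is raised inside the current thread. The record type `config` and the function names are assumptions of this example.

```ocaml
(* Hypothetical simplified configuration: threads are numbered from 1. *)
type config = { threads : string list; current : int }

(* The initial configuration runs a single thread, numbered 1. *)
let initial p = { threads = [ p ]; current = 1 }

(* addthread: the new thread receives the next number;
   the current thread remains active. *)
let addthread c p = { c with threads = c.threads @ [ p ] }

(* schedule: switch to thread tj when it exists,
   otherwise raise Invalid_argument. *)
let schedule c tj =
  if 1 <= tj && tj <= List.length c.threads then { c with current = tj }
  else raise (Invalid_argument "schedule")
```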

Splitting the definition of the semantics into three relations allows us to lighten notations in proofs: we can use the reduction on a thread, or on a thread and the global store, without mentioning the other components when they are not affected.

The construct $addthread$ does not allow using the same local store in several threads, which corresponds to forbidding fork in the middle of a role, as mentioned in Assumption (A7). Moreover, we reduce only the active thread, and we change threads only with $schedule$ . Since neither the primitives nor the generated modules use $schedule$ , thread scheduling is entirely under the control of the adversary. This seemingly restrictive semantics, in which only one thread is active at a time and oracle calls cannot be interleaved with other threads, is justified for two reasons.

First, it is sufficient to represent all program executions under the weaker assumption that two threads that read or write the same file are not run concurrently. Indeed, two oracles can interfere with each other only through files, and such interferences are forbidden by this assumption. Hence, by swapping execution steps, a trace that obeys this assumption with any interleaving of the oracle calls can be transformed into an equivalent trace in which the oracle calls are never interrupted, that is, a trace that can be scheduled in our semantics.

Second, it resembles the CryptoVerif semantics, which also has a single active thread and processes one oracle call after the other. This point facilitates the proof of our compiler.

5.2.6. Modules

OCaml programs typically contain several modules. We adopt a very simple model of modules. A module named μ consists of an OCaml program $program (μ)$ and its interface $interface (μ)$ that is the set of OCaml identifiers defined in μ and usable in other modules. The program $program (μ)$ initializes the module μ and makes available the identifiers defined in the interface of the module μ. When needed to distinguish identifiers coming from different modules, we use identifiers of the form $μ . x$ for variables defined in module μ. A correct OCaml program is then of the form $program = program (μ_{1});; \dots;; program (μ_{n});;$ , where, for all $i ⩽ n$ , the free variables of $program (μ_{i})$ are defined in the interfaces of $μ_{j}$ with $j < i$ , and $program (μ_{i})$ is a list of definitions. (The initial program of a module is never $raise e$ , but it may reduce into $raise e$ during execution.)

Such a program is run by using the previous reduction rules from the initial configuration $C_{0} (program) = [⟨ \emptyset, program, [], \emptyset ⟩], {globalstore}_{0}, 1$ where ${globalstore}_{0} = \{l \mapsto {initval}_{l} ∣ l \in {Loc}_{g}\}$ is the initial value of the global store, and ${initval}_{l}$ is the default value for location l: the empty list $[]$ for lists, the empty string "" for strings, 0 for integers, $false$ for booleans. Values in the global store cannot contain locations or closures, so we do not define a default value for them. The program $program$ contains neither closures nor locations in ${Loc}_{ℓ}$ , but may contain locations in ${Loc}_{g}$ . (Closures are created by $function$ and $let rec$ ; locations in ${Loc}_{ℓ}$ are created by $ref$ .)

Although we ignore types in our syntax, we suppose that our OCaml programs are well typed, which is checked by the OCaml compiler, and we use the guarantee that well-typed programs do not go wrong: a program stops only when the current thread has been reduced into the empty definition list or an exception $raise v$ (with the empty stack).

5.2.7. Equivalence modulo renaming of locations

The rule (Store alloc) is non-deterministic, since the new location l can be any unused location in ${Loc}_{ℓ}$ . To remove this non-determinism, we consider equivalence classes of OCaml semantic configurations modulo renaming of locations in ${Loc}_{ℓ}$ . We still denote these equivalence classes as OCaml configurations $C$ , and denote an equivalence class by one of its members. On these equivalence classes, the semantics is purely probabilistic. (There is no non-deterministic choice.) If a configuration $C$ can reduce, then the sum of the probabilities of all possible reductions is 1: $\sum_{\{C^{'} ∣ C \to_{p (C^{'})} C^{'}\}} p (C^{'}) = 1$ . Moreover, for each reduction $C \to_{p} C^{'}$ , we have $p > 0$ .

We will also use notations similar to Definition 4.8 for the OCaml semantics. We denote by $C T$ an OCaml trace, $C T S$ a set of OCaml traces, and we also use the notation $\to^{*}$ for reductions with several steps.

6. Instrumentation of the OCaml semantics

In order to prove the correctness of our compiler, we instrument OCaml code in three ways; this section details this instrumentation and proves that it does not alter the semantics of OCaml.

Fig. 14.

Semantics of tagged functions.

First, we add a new kind of functions and closures $tagfunction$ that behave exactly in the same way as regular functions and closures, but are labeled with additional tags. We use these tagged functions to differentiate functions coming from our generated code and functions coming from the adversary. Hence, we add two new expressions ${tagfunction}^{t} pm$ for tagged functions and ${tagfunction}^{t, τ} [env, pm]$ for the corresponding closures. We also add ${tagfunction}^{t, τ} [env, pm]$ to the values. The tag t indicates the origin of the function or closure; it will be an oracle name or a role name, indicating that the function implements this oracle or role. The tag τ is a fresh tag generated when the function is reduced into a closure: each new closure gets a different tag, so that two closures are the same if and only if they have the same tag. This property will be used in Section 8 to count the number of calls to the same closure. The semantic rules for tagged functions are given in Fig. 14. They are the same as those for ordinary functions, except for the addition of tags. Much like for locations, we consider traces modulo renaming of tags τ, so that the choice of a fresh tag τ in (Tagged closure) does not lead to non-determinism. The condition that τ is fresh in this rule means that τ is distinct from all tags previously used in the considered trace.

Second, we need to be able to match CryptoVerif events, so we add to the semantic configuration an element $events$ that contains the list of the events executed until now. We add the expression $event ev (e_{1}, \dots, e_{k})$ that adds the event $ev (v_{1}, \dots, v_{k})$ to $events$ when $e_{1}, \dots, e_{k}$ evaluate to the values $v_{1}, \dots, v_{k}$ , respectively. We consider a new minimal expression evaluation context $event ev (e_{1}, \dots, e_{i - 1}, [\cdot], v_{i + 1}, \dots, v_{n})$ , for evaluating the arguments of events via rules (Context in) and (Context out), and for handling exceptions inside events via rule (Context raise1). Events serve in specifying security properties of protocols, so they appear in generated code, but cannot be used by the adversary.

Fig. 15.

Updated toplevel rules for the instrumented semantics.

Third, the roles of a CryptoVerif process cannot be executed in any order: if a role is defined after the return from an oracle, it can be executed only after the previous oracle has returned. For instance, we can run a server only after generating its keys. We need to enforce this constraint also in the OCaml program. Each CryptoVerif role $role$ is translated by our compiler into an OCaml module $μ_{role}$ . We add to the OCaml configuration the multiset of callable modules $M I$ that contains pairs $(μ_{role}, γ)$ of a module $μ_{role}$ and a flag $γ \in \{Once, Any\}$ , indicating, if $Once$ , that the module can be called only once and, if $Any$ , that the module can be called any number of times. Hence, the instrumented semantic configuration is $C I = [{th}_{1}, \dots, {th}_{n}], globalstore, tj, M I, events$ . We adapt the toplevel semantic rules to this configuration as shown in Fig. 15. The instrumented semantic rules (New toplevel), (New toplevel schedule1) and (New toplevel schedule2) are straightforwardly adapted from the corresponding rules in the non-instrumented semantics by adding the components $M I$ , $events$ . The rule (Toplevel event) gives the semantics of $event$ : it adds its argument $ev (v_{1}, \dots, v_{n})$ to the list $events$ in the configuration and returns $(v_{1}, \dots, v_{n})$ . The rule (New toplevel add thread) gives the instrumented semantics of $addthread$ : the $addthread$ construct is modified to reject new programs that contain a module that cannot be called. We let $M_{g}$ be the set of generated modules. The programs spawned by $addthread$ can be of two forms.
Either they are attacker programs that contain neither the module corresponding to the primitives $μ_{prim}$ nor any generated module in $M_{g}$ , or they are protocol programs that first contain the module corresponding to the primitives $μ_{prim}$ , then the necessary generated modules $μ_{1}, \dots, μ_{l}$ in $M_{g}$ , and finally any non-generated program ${program}^{'}$ . (We require this order on the modules for simplicity.) The generated modules $μ_{1}, \dots, μ_{l}$ must be callable according to the value of $M I$ . The modules $μ_{1}, \dots, μ_{l}$ that can be called only once are removed from the callable modules by removing the multiset ${M I}^{'}$ from $M I$ .

We also add the expression $return ({M I}^{'}, e)$ that adds to the multiset $M I$ the generated modules present in ${M I}^{'}$ , and returns the result of e, as defined by rule (Toplevel return). This expression is useful to add modules newly defined at the return from an oracle. We also add the minimal expression evaluation context $return (M I, [\cdot])$ to be able to evaluate the second argument of $return$ .
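The bookkeeping on the multiset of callable modules can be sketched as follows. The representation of $M I$ as a list of pairs, the function names, and the choice of using an $Any$ entry in preference to a $Once$ entry when both are present are assumptions of this example; the precise conditions are those of rules (New toplevel add thread) and (Toplevel return) in Fig. 15.

```ocaml
type flag = Once | Any

(* The multiset MI, represented as a list of (module name, flag) pairs. *)
type mi = (string * flag) list

(* Remove one occurrence of x from a list; None when x does not occur. *)
let rec remove_one x = function
  | [] -> None
  | y :: rest when y = x -> Some rest
  | y :: rest -> Option.map (fun r -> y :: r) (remove_one x rest)

(* Try to call the generated modules mods. Each of them must be callable.
   Entries flagged Once are consumed; entries flagged Any are kept.
   The result None models the rejection of the new program. *)
let rec call_modules (mi : mi) (mods : string list) : mi option =
  match mods with
  | [] -> Some mi
  | m :: rest ->
      if List.mem (m, Any) mi then call_modules mi rest
      else
        Option.bind (remove_one (m, Once) mi) (fun mi2 -> call_modules mi2 rest)

(* return(MI2, e): the modules of MI2 become callable. *)
let return_modules (mi : mi) (mi2 : mi) : mi = mi @ mi2
```

For instance, a key generation module flagged $Once$ can be called a single time, after which a server module added by $return$ with flag $Any$ can be called repeatedly.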

Let us now show that this instrumentation does not alter the semantics of OCaml: an instrumented program behaves exactly in the same way as that program with the instrumentation deleted, provided only allowed roles are executed, as assumed by Assumption (A3). This assumption is formalized as follows:

Assumption 6.1 (Only allowed roles).

The instrumented $addthread$ rule (New toplevel add thread) never fails.

We first show that, when a program or expression is a value v or an exceptional value $raise v$ , the environment does not matter. To prove this property, we define the following equivalence.

Definition 6.2.
We define the equivalence $\approx_{v th}$ on threads by $⟨ env, pe, stack, store ⟩ \approx_{v th} ⟨ {env}^{'}, {pe}^{'}, {stack}^{'}, {store}^{'} ⟩$ if and only if $pe, stack, store = {pe}^{'}, {stack}^{'}, {store}^{'}$ , and if $pe$ is not a value v or an exceptional value $raise v$ , then $env = {env}^{'}$ .

We extend this equivalence to non-instrumented configurations $C$ and $C^{'}$ by $C \approx_{v} C^{'}$ if and only if
$C = [{th}_{1}, \dots, {th}_{n}], globalstore, tj$ ,

$C^{'} = [{th}_{1}^{'}, \dots, {th}_{n}^{'}], globalstore, tj$ ,

$\forall {tj}^{'} ⩽ n$ , ${th}_{{tj}^{'}} \approx_{v th} {th}_{{tj}^{'}}^{'}$ .

We first show that configurations equivalent by $\approx_{v}$ reduce in the same way.
Lemma 6.3.
If $C \approx_{v} C^{'}$ and $C \to_{p} C^{″}$ , then $C^{'} \to_{p} C^{‴}$ and $C^{″} \approx_{v} C^{‴}$ .

We prove this lemma in Appendix B (see the Supplemental material).

Let us now define the function ${noinstr}_{C I}$ that takes a configuration in the instrumented semantics and returns the corresponding configuration in the non-instrumented semantics.
Definition 6.4.
The function ${noinstr}_{th 1}$ applied to a thread replaces
every $return (M I, e)$ with e,

every $event ev (e_{1}, \dots, e_{n})$ with $(e_{1}, \dots, e_{n})$ ,

and all $tagfunction$ functions and closures with regular ones
in this thread.

The function ${noinstr}_{th 2}$ modifies the stack of the thread by
removing any pair of the form $(env, return (M I, [\cdot]))$ ,

and transforming each pair of the form $(env, event ev (e_{1}, \dots, e_{i - 1}, [\cdot], v_{i + 1}, \dots, v_{n}))$ into the pair $(env, (e_{1}, \dots, e_{i - 1}, [\cdot], v_{i + 1}, \dots, v_{n}))$ .

Let ${noinstr}_{th} \overset{def}{=} {noinstr}_{th 1} \circ {noinstr}_{th 2}$ .

Finally, let us define $\begin{array}{rcl} {noinstr}_{C I} ([{th}_{1}, \dots, {th}_{n}], globalstore, tj, M I, events) \\ \overset{def}{=} [{noinstr}_{th} ({th}_{1}), \dots, {noinstr}_{th} ({th}_{n})], globalstore, tj \end{array}$

We do not need to replace elements of the global store, as they cannot contain closures: $event$ , $return$ , and tagged functions cannot appear in them.
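The expression part of this definition (the function ${noinstr}_{th 1}$ ) can be sketched as a recursive traversal of the syntax tree. The type `expr` below is a hypothetical fragment of the instrumented syntax: functions take a single variable instead of a pattern matching, closures and stacks are omitted, and the multiset of modules is reduced to a list of names.

```ocaml
(* Hypothetical fragment of the instrumented expression syntax. *)
type expr =
  | Const of int
  | Var of string
  | Tuple of expr list
  | App of expr * expr
  | Fun of string * expr                (* regular function *)
  | TagFun of string * string * expr    (* tagged function: tag, parameter, body *)
  | Event of string * expr list         (* event ev(e1, ..., en) *)
  | Return of string list * expr        (* return(MI, e) *)

(* Delete the instrumentation from an expression. *)
let rec noinstr (e : expr) : expr =
  match e with
  | Const _ | Var _ -> e
  | Tuple es -> Tuple (List.map noinstr es)
  | App (e1, e2) -> App (noinstr e1, noinstr e2)
  | Fun (x, body) -> Fun (x, noinstr body)
  | TagFun (_, x, body) -> Fun (x, noinstr body)   (* drop the tag *)
  | Event (_, es) -> Tuple (List.map noinstr es)   (* keep only the arguments *)
  | Return (_, e1) -> noinstr e1                   (* keep only the result *)
```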

The next proposition shows that, with Assumption 6.1, there is a weak bisimulation between the non-instrumented semantics and the instrumented semantics, that is, the reductions match in the two semantics, but the number of steps may differ. Indeed, the $return$ and $event$ expressions introduce an additional transition in the instrumented semantics. All other constructs reduce in the same number of steps in both semantics. Hence, the instrumentation does not alter the semantics of the language. This result is proved in Appendix B (see the Supplemental material).
Proposition 6.5.

If $C \approx_{v} {noinstr}_{C I} (C I)$ and $C_{1}, \dots, C_{n}$ are pairwise distinct configurations such that for all $i ⩽ n$ , we have $C \to_{p_{i}} C_{i}$ with $\sum_{i ⩽ n} p_{i} = 1$ , then there exist pairwise distinct instrumented configurations ${C I}_{1}, \dots, {C I}_{n}$ such that for all $i ⩽ n$ , we have $C I \to_{p_{i}}^{*} {C I}_{i}$ and $C_{i} \approx_{v} {noinstr}_{C I} ({C I}_{i})$ .

If $C \approx_{v} {noinstr}_{C I} (C I)$ and ${C I}_{1}, \dots, {C I}_{n}$ are pairwise distinct instrumented configurations such that for all $i ⩽ n$ , we have $C I \to_{p_{i}} {C I}_{i}$ with $\sum_{i ⩽ n} p_{i} = 1$ , then there exist pairwise distinct configurations $C_{1}, \dots, C_{n}$ such that for all $i ⩽ n$ , we have $C \to_{p_{i}}^{*} C_{i}$ and $C_{i} \approx_{v} {noinstr}_{C I} ({C I}_{i})$ .

In the rest of the paper, we use only the instrumented semantics. Furthermore, we denote instrumented configurations by $C$ to lighten notations.
7. Translation

In this section, we describe how our compiler translates an annotated CryptoVerif process. It translates each CryptoVerif role $role$ into an OCaml module $μ_{role}$ and each CryptoVerif oracle into a function. Let $G_{var}$ be an injective function that takes a CryptoVerif variable name and returns an OCaml variable name.

Let us recall that the function $G_{f} (f)$ , defined in Section 4.3, returns the name of the OCaml function corresponding to the CryptoVerif function f. The function $G_{M}$ transforms a CryptoVerif term M into an OCaml term. It is defined as follows: $\begin{array}{ll} G_{M} (x [\tilde{i}]) \overset{def}{=} G_{var} (x) & (Variable) \\ G_{M} (f (M_{1}, \dots, M_{m})) \overset{def}{=} G_{f} (f) (G_{M} (M_{1})), \dots, (G_{M} (M_{m})) & (Function call) \end{array}$ The OCaml code generated by this definition matches the semantics of CryptoVerif terms given in Fig. 3.

Before defining the translation of an oracle, let us first introduce some notations. For each CryptoVerif variable x, we denote by $T_{x}$ the type of x, and by extension, for each CryptoVerif term M, we denote by $T_{M}$ the type of M. More precisely, if M is the variable x, then $T_{M} \overset{def}{=} T_{x}$ , and if M is a function application with a function of type $T_{1} \times \dots \times T_{n} \to T$ , then $T_{M} \overset{def}{=} T$ .

We say that an oracle or role definition occurs at the beginning of Q when it is found in Q just under replication or parallel composition, without recursively looking into oracle definitions. We define the function $oracledeflist$ that returns a description of the oracles made available by an oracle definition Q. In more detail, $oracledeflist (Q)$ is a list $[(Q_{1}, γ_{1}), \dots, (Q_{l}, γ_{l})]$ such that $Q_{1}, \dots, Q_{l}$ are the oracle definitions at the beginning of Q, from left to right, and, for each $j ⩽ l$ , $γ_{j}$ is $Any$ when $Q_{j}$ is under replication, and $Once$ otherwise. In this function, the replication indices $\tilde{i}$ can be partially instantiated into integer values. In contrast to the function $oracledefset$ , $oracledeflist (foreach i ⩽ n do Q)$ does not instantiate the replication index i. $\begin{array}{ll} oracledeflist (0) \overset{def}{=} [] & (Nil) \\ oracledeflist (Q_{1} ∣ Q_{2}) \overset{def}{=} oracledeflist (Q_{1}) @ oracledeflist (Q_{2}) & (Par) \\ oracledeflist (foreach i ⩽ n do Q) \overset{def}{=} [(Q_{1}, Any), \dots, (Q_{l}, Any)] \\ \quad when oracledeflist (Q) = [(Q_{1}, γ_{1}), \dots, (Q_{l}, γ_{l})] for some γ_{1}, \dots, γ_{l} & (Repl) \\ oracledeflist (O [\tilde{i}] (x_{1} [\tilde{i}], \dots, x_{k} [\tilde{i}]) := P) \overset{def}{=} [(O [\tilde{i}] (x_{1} [\tilde{i}], \dots, x_{k} [\tilde{i}]) := P, Once)] & (Oracle) \\ oracledeflist (role \{ Q) \overset{def}{=} [] & (Role) \end{array}$ The function $oracledeflist$ takes processes Q that follow $return$ statements that do not end a role. By Assumption 4.13, we are inside a role, so by Property 4.12, the construct $role \{ Q^{'}$ cannot appear in Q before a $return$ statement that ends the current oracle. So, the function $oracledeflist$ will never be called on $role \{ Q^{'}$ .

We also define the function $G_{getMI}$ that returns a description of the modules that correspond to roles defined at the beginning of an oracle definition Q. The function $G_{getMI}$ is similar to the function $oracledeflist$ above: it returns pairs containing the module generated for the role and a flag ( $Any$ or $Once$ ) indicating whether the role is under replication or not. In contrast to $oracledeflist$ , it returns a set and not a list. $\begin{array}{ll} G_{getMI} (0) \overset{def}{=} \emptyset & (Nil) \\ G_{getMI} (Q_{1} ∣ Q_{2}) \overset{def}{=} G_{getMI} (Q_{1}) \cup G_{getMI} (Q_{2}) & (Par) \\ G_{getMI} (foreach i ⩽ n do Q) \overset{def}{=} \{(μ, Any) ∣ \exists γ, (μ, γ) \in G_{getMI} (Q)\} & (Repl) \\ G_{getMI} (O [\tilde{i}] (x_{1} [\tilde{i}], \dots, x_{k} [\tilde{i}]) := P) \overset{def}{=} \emptyset & (Oracle) \\ G_{getMI} (role \{ Q) \overset{def}{=} \{(μ_{role}, Once)\} & (Role) \end{array}$ The function $G_{getMI}$ takes processes Q that follow $return$ statements that end the current role. By Assumption 4.13, there cannot be an oracle definition outside a $role \{ Q^{'}$ in Q. So the function $G_{getMI}$ will never be called on oracle definitions.
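The functions $oracledeflist$ and $G_{getMI}$ are both simple recursive traversals of the process, and can be sketched side by side. The type `proc` is a hypothetical, heavily simplified process syntax: oracle bodies and replication bounds are omitted, roles and oracles are identified by a name, the module generated for a role `r` is written `"mu_" ^ r`, and sets are represented as sorted duplicate-free lists.

```ocaml
type flag = Once | Any

(* Hypothetical process syntax; oracle bodies are reduced to a name. *)
type proc =
  | Zero                   (* 0 *)
  | Par of proc * proc     (* Q1 | Q2 *)
  | Repl of proc           (* foreach i <= n do Q *)
  | Oracle of string       (* oracle definition *)
  | Role of string * proc  (* role definition *)

(* Oracle definitions at the beginning of q, from left to right. *)
let rec oracledeflist (q : proc) : (proc * flag) list =
  match q with
  | Zero -> []
  | Par (q1, q2) -> oracledeflist q1 @ oracledeflist q2
  | Repl q1 -> List.map (fun (o, _) -> (o, Any)) (oracledeflist q1)
  | Oracle _ -> [ (q, Once) ]
  | Role _ -> []

(* Modules of the roles at the beginning of q, as a set. *)
let rec get_mi (q : proc) : (string * flag) list =
  match q with
  | Zero -> []
  | Par (q1, q2) -> List.sort_uniq compare (get_mi q1 @ get_mi q2)
  | Repl q1 ->
      List.sort_uniq compare (List.map (fun (m, _) -> (m, Any)) (get_mi q1))
  | Oracle _ -> []
  | Role (r, _) -> [ ("mu_" ^ r, Once) ]
```

In both functions, the replication case is the only one that changes flags: every entry found under a replication is marked $Any$ .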

Fig. 16.

Translation function $G$ of an oracle body in OCaml.

Fig. 17.

Translation of an oracle.

To translate an oracle, we translate the body of the oracle using the function $G$ defined in Fig. 16. Most cases are straightforward: the function $G$ generates OCaml code that encodes the semantics of oracle bodies given in Figs 3 and 4. After defining a variable, we store it in a file if needed, using $G_{file} (x [\tilde{i}])$ , defined by $G_{file} (x [\tilde{i}]) \overset{def}{=} (f : = G_{ser} (T_{x}) G_{var} (x))$ if $(x [\tilde{i}], f) \in Files$ and $G_{file} (x [\tilde{i}]) \overset{def}{=} ()$ otherwise. A file is modeled by a global store location.

For the $return$ case, if the $return$ is not at the end of a role (i.e., there is an oracle in the same role following it), we return the closures corresponding to the oracles defined after the $return$ , as defined in ( $Return1$ ). (The function $G_{O}$ is defined in Fig. 17.) Otherwise, we update the set of available roles using the $return$ expression introduced in Section 6, as defined in ( $Return2$ ).

In the $insert$ case, we add the inserted element to the considered table $Tbl$ contained in the global store location f. In the $get$ case, we read the table by $read_table$ , keeping only the elements that satisfy the required condition (which is tested by $G_{test}$ ). These elements are stored in the list l. If l is empty, we run $P^{'}$ ; otherwise, we choose a random element in l and run P with that element. To choose that element, we use a function ${random}_{ℓ}$ such that ${random}_{ℓ} l$ returns a random element of the list l, such that the probability of returning the jth element of l is $almostunif ({1, \dots, | l |}, j)$ . We assume that this function is programmed using the OCaml primitive $random$ , and is present in the module for cryptographic primitives $μ_{prim}$ .
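The behavior described for $insert$ and $get$ can be sketched in plain OCaml as follows. This is only an illustration under assumptions: the table is a list held in a reference, `insert` appends at the end (the text does not fix the position), and `random_l` is a stand-in built on `Random.int`, about whose distribution we make no claim with respect to $almostunif$.

```ocaml
(* Stand-in for random_l: picks an element of a non-empty list.
   Calling it on an empty list raises Invalid_argument (from Random.int). *)
let random_l (l : 'a list) : 'a =
  List.nth l (Random.int (List.length l))

(* insert: add an element to the table stored in a global location. *)
let insert (table : 'a list ref) (x : 'a) : unit =
  table := !table @ [ x ]

(* get: keep the elements that satisfy [test]; run [p'] when there is
   none, otherwise run [p] on a randomly chosen matching element. *)
let get (table : 'a list ref) (test : 'a -> bool)
    (p : 'a -> 'b) (p' : unit -> 'b) : 'b =
  match List.filter test !table with
  | [] -> p' ()
  | l -> p (random_l l)
```

Since `get` calls `random_l` only in the non-empty branch, the random choice is always made on a list with at least one element.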

An oracle $O (x_{1}, \dots, x_{n}) : = P$ is transformed into a closure by the function $G_{O}$ as shown in Fig. 17. When the oracle O is not under replication (the second argument of $G_{O}$ is $Once$ , in ( $Oracle1$ )), we use a token $token$ to make sure that it can be called only once. This token can take the values $Callable$ and $Invalid$ . It is initially set to $Callable$ , and it is set to $Invalid$ in the first call. In subsequent calls, the exception $Bad_Call$ will be raised. The translation of an oracle always checks that the arguments are correct values for their CryptoVerif types, and stores them in files if necessary by calling $G_{file}$ .
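The token mechanism of ( $Oracle1$ ) can be illustrated in isolation. The helper `once` below is hypothetical and leaves out the checks on arguments and the storage in files; it only shows how a reference captured by the closure restricts the oracle to a single call.

```ocaml
exception Bad_Call

type token = Callable | Invalid

(* Wrap [body] into a closure that can be called at most once. *)
let once (body : 'a -> 'b) : 'a -> 'b =
  let token = ref Callable in
  fun arg ->
    if !token = Callable then begin
      token := Invalid;   (* invalidate before running the body *)
      body arg
    end
    else raise Bad_Call

(* Each application of [once] allocates its own token. *)
let succ_once = once (fun x -> x + 1)
```

The first call `succ_once 1` returns 2, and any later call raises `Bad_Call`. An oracle under replication would simply omit the token.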

Finally, we generate an OCaml module $μ_{role}$ for each role $role$ in the CryptoVerif process. This module provides a single function $init$ , which returns the functions implementing the oracles defined at the beginning of $Q (role)$ , so its interface is $interface (μ_{role}) \overset{def}{=} {μ_{role} . init}$ and its program is $\begin{array}{rcl} program (μ_{role}) \overset{def}{=} let μ_{role} . init = let token = ref Callable in {tagfunction}^{role} {pm}_{role} \\ where {pm}_{role} \overset{def}{=} () \to \\ if (! token = Callable) then \\ (token : = Invalid; \\ G_{read} (x_{1} []) in \dots in G_{read} (x_{m} []) in \\ (G_{O} (Q_{1}, γ_{1}), \dots, G_{O} (Q_{k}, γ_{k}))) \\ else raise Bad_Call \end{array}$ where $[(Q_{1}, γ_{1}), \dots, (Q_{k}, γ_{k})] = oracledeflist (Q (role))$ and $x_{1} [], \dots, x_{m} []$ are the free variables of $Q (role)$ , which are the variables we need to retrieve from the files. The function $G_{read} (x [])$ , which reads the contents of the file associated to $x []$ , is defined by $G_{read} (x []) \overset{def}{=} let G_{var} (x) = G_{deser} (T_{x}) (! f)$ if $(x [], f) \in Files$ .

Example 7.1.

Let us explain the translation of the role $keygen$ described in Examples 4.1 and 4.10. This role contains the following oracle $Okeygen () : = rk \overset{R}{\leftarrow} keyseed; pk \leftarrow pkgen (rk); sk \leftarrow skgen (rk); return (pk)$

This role is translated into the module $μ_{keygen}$ . Its program $program (μ_{keygen})$ is:

$\begin{array}{l}
let μ_{keygen} . init = let token = ref Callable in {tagfunction}^{keygen} () \to \\
if (! token = Callable) then \\
(token : = Invalid; \\
let token = ref Callable in {tagfunction}^{Okeygen} () \to \\
if (! token = Callable) then \\
(token : = Invalid; \\
let G_{var} (rk) = G_{random} (keyseed) () in \\
let G_{var} (pk) = G (pkgen) G_{var} (rk) in \\
pkfile : = G_{ser} (T_{pk}) G_{var} (pk); \\
let G_{var} (sk) = G (skgen) G_{var} (rk) in \\
skfile : = G_{ser} (T_{sk}) G_{var} (sk); \\
return ({(μ_{alice}, Any), (μ_{bob}, Any)}, G_{var} (pk))) \\
else raise Bad_Call) \\
else raise Bad_Call
\end{array}$

This program defines the function $μ_{keygen} . init$ , which expects $()$ as argument and returns the function that implements oracle $Okeygen$ . This function itself expects $()$ as argument and returns the OCaml representation of the public key $pk$ returned by $Okeygen$ .

The function $μ_{keygen} . init$ can be called only once, which is guaranteed using a reference $token$ to either $Callable$ or $Invalid$ . If $token$ is already $Invalid$ , then this function has already been called, so we raise the exception $Bad_Call$ . Otherwise, we set $token$ to $Invalid$ and we continue by the translation of the CryptoVerif oracle $Okeygen$ . The oracle $Okeygen$ can also be called only once. So we define a new reference $token$ , to guarantee this property, and define the function that implements $Okeygen$ . When this function is called for the first time, it sets this second $token$ to $Invalid$ , creates a new key seed $G_{var} (rk)$ , computes the keys $pk$ and $sk$ and stores them into files, modeled by the global store references $pkfile$ and $skfile$ . Finally, it returns the public key $G_{var} (pk)$ . Since the oracle $Okeygen$ ends the role $keygen$ , and is followed by the roles $alice$ and $bob$ , we update the set of callable modules $M I$ with the newly defined modules $μ_{alice}$ and $μ_{bob}$ , which can be called any number of times, using the $return$ expression. When this function is called again, it raises the exception $Bad_Call$ .

To call the translation of oracle $Okeygen$ , one can execute: $μ_{keygen} . init () ()$ This code first initializes the role by calling $μ_{keygen} . init ()$ , which returns a closure corresponding to the translation of $Okeygen$ , and then calls this closure.
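For readers who wish to experiment, the program of Example 7.1 can be approximated in plain OCaml. The sketch below rests on several assumptions: the functions `random_keyseed`, `pkgen`, `skgen` and `ser_key` are toy stand-ins for $μ_{prim}$ and provide no security, tagged functions are replaced by ordinary functions, and the $return$ expression that updates the set of callable modules is replaced by returning the public key directly.

```ocaml
exception Bad_Call

type token = Callable | Invalid

(* Toy stand-ins for the primitives; NOT real cryptography. *)
let random_keyseed () : string =
  String.init 16 (fun _ -> Char.chr (Random.int 256))
let pkgen (rk : string) : string = "pk:" ^ rk
let skgen (rk : string) : string = "sk:" ^ rk
let ser_key (k : string) : string = k

(* Files are modeled by global references, initially empty strings. *)
let pkfile = ref ""
let skfile = ref ""

module Keygen = struct
  let init =
    let token = ref Callable in          (* token of the role *)
    fun () ->
      if !token = Callable then begin
        token := Invalid;
        let token = ref Callable in      (* token of oracle Okeygen *)
        fun () ->
          if !token = Callable then begin
            token := Invalid;
            let rk = random_keyseed () in
            let pk = pkgen rk in
            pkfile := ser_key pk;
            let sk = skgen rk in
            skfile := ser_key sk;
            pk                           (* the update of MI is omitted *)
          end
          else raise Bad_Call
      end
      else raise Bad_Call
end

(* Initialize the role, then call the oracle once. *)
let okeygen = Keygen.init ()
let pk = okeygen ()
```

After these two calls, `pkfile` and `skfile` hold the serialized keys, and calling either `Keygen.init` or `okeygen` again raises `Bad_Call`.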

The generated modules $M_{g}$ ( $μ_{role}$ for each $role$ in the CryptoVerif process) are included in manually written programs that represent the full implementation of the protocol, for instance a client and a server. In particular, these programs are responsible for sending the results of oracles to the network and receiving messages to be passed as arguments to oracles. These programs interact with an adversary that we model as an OCaml program ${program}_{0}$ . We consider that the programs of the protocol are launched by the adversary ${program}_{0}$ using the $addthread$ construct. The generated modules depend only on the module containing the cryptographic primitives $μ_{prim}$ , so when the program of a thread uses the primitives or the generated modules, we can order the programs of the modules in the argument of $addthread$ in the order $program (μ_{prim});; program (μ_{{role}_{1}});; \dots;; program (μ_{{role}_{k}});; {program}^{'}$ where ${program}^{'}$ contains no generated module, as required by the instrumented semantics of $addthread$ (New toplevel add thread). We assume that ${program}_{0}$ uses the generated modules only inside $addthread$ . Moreover, the network code is well typed by Assumption (A5). Well-typed OCaml with $random$ is probabilistic Turing complete, so the adversary itself can be implemented by a well-typed OCaml program. Therefore, we can assume that ${program}_{0}$ is a well-typed OCaml program. (Our OCaml programs include $random$ and exclude type-casting and other constructs that allow bypassing the type system, as defined in Section 5.) Only the generated modules use events, tagged functions and $return$ . The adversary must not use events, which serve for specifying security properties of the protocol, nor $return$ , which serves for updating the set of callable generated modules. It uses regular functions rather than tagged functions.
Moreover, as mentioned in Assumption (A4), we suppose that only the generated modules access files that contain private CryptoVerif data (free variables of roles and tables). So we let ${Loc}_{priv} \overset{def}{=} {f ∣ (x [], f) \in Files or (Tbl, f) \in Tables} \subseteq {Loc}_{g}$ be the set of global locations reserved for private CryptoVerif data, and we have the following assumption.

Assumption 7.2.

The locations in ${Loc}_{priv}$ occur only in the programs of generated modules; they do not occur elsewhere in ${program}_{0}$ .

The program ${program}_{0}$ is run in the initial (instrumented) OCaml configuration $C_{0} (Q_{0}, {program}_{0})$ defined as follows: $C_{0} (Q_{0}, {program}_{0}) \overset{def}{=} [⟨ \emptyset, {program}_{0}, [], \emptyset ⟩], {globalstore}_{0}, 1, G_{getMI} (Q_{0}), []$ where $G_{getMI} (Q_{0})$ is the set of modules available at the beginning of the execution and ${globalstore}_{0} \overset{def}{=} {l \mapsto {initval}_{l} ∣ l \in {Loc}_{g}}$ is the initial value of the global store as defined in Section 5.2. Tables are represented by lists, and their initial value ${initval}_{l}$ is the empty list $[]$ , representing that the tables are initially empty. Files that contain free variables of roles are represented by strings, and their initial value ${initval}_{l}$ is the empty string "". For other elements, the initial value ${initval}_{l}$ is the default value for the type of location l.

Example 7.3.

Let us consider the following toy OCaml program ${program}_{0}$ , which uses the translation from Example 7.1 of the process given in Example 4.10. $\begin{array}{l} let \_ = \\ addthread (program (μ_{prim});; program (μ_{keygen});; \\ let \_ = pkg : = μ_{keygen} . init () (); schedule (1);;); \\ schedule (2);; \end{array}$ This example only creates a thread for key generation, then schedules it by $schedule (2)$ . This thread stores the public key returned by the oracle $Okeygen$ in the global store location $pkg$ , and returns control to the initial thread by $schedule (1)$ .

Following the annotations of Example 4.10, this example uses two private global store locations, $skfile$ and $pkfile$ , to store the private and public keys, so ${Loc}_{priv} = {skfile, pkfile}$ . It also uses the global store location $pkg$ , so ${Loc}_{g} = {Loc}_{priv} \cup {pkg}$ . Assuming keys are represented by strings, the initial global store is ${globalstore}_{0} = {skfile \mapsto "", pkfile \mapsto "", pkg \mapsto ""}$ . The initial set of available modules is ${M I}_{0} = G_{getMI} (Q_{0}) = {(μ_{keygen}, Once)}$ , and the initial configuration is $C_{0} (Q_{0}, {program}_{0}) = [⟨ \emptyset, {program}_{0}, [], \emptyset ⟩], {globalstore}_{0}, 1, {M I}_{0}, []$ .

Detailing the reductions of this configuration would take too much space, but we still give some information on the configuration obtained after evaluating $μ_{keygen} . init ()$ . We use this configuration in other examples below. This configuration is obtained after launching the thread for key generation, so it has 2 threads, the active thread is thread 2, and no event has been executed, hence it is $C_{1} \overset{def}{=} [{th}_{1}, {th}_{2}], {globalstore}_{1}, 2, {M I}_{1}, {events}_{1}$ with ${events}_{1} = []$ . The second thread uses the module $μ_{keygen}$ given in Example 7.1. After evaluating $μ_{keygen} . init ()$ , we obtain ${th}_{2} \overset{def}{=} ⟨ {env}_{2}, {pe}_{2}, {stack}_{2}, {store}_{2} ⟩$ , where $\begin{array}{rcl} {env}_{prim} is the environment after evaluating program (μ_{prim}), \\ {env}_{2} \overset{def}{=} {env}_{prim} \oplus {μ_{keygen} . init \mapsto {tagfunction}^{keygen, τ_{1}} [{env}_{prim} \cup {token \mapsto l_{1}}, \\ () \to (lines 2–14 of Example 7.1)]}, \\ {pe}_{2} \overset{def}{=} {tagfunction}^{Okeygen, τ_{2}} [{env}_{2} \oplus {token \mapsto l_{2}}, \\ () \to (lines 5–13 of Example 7.1)] (), \\ {stack}_{2} \overset{def}{=} [({env}_{2}, pkg : = [\cdot]); ({env}_{2}, [\cdot]; schedule (1)); ({env}_{2}, let \_ = [\cdot];;)], \\ {store}_{2} \overset{def}{=} {l_{1} \mapsto Invalid, l_{2} \mapsto Callable} . \end{array}$ Thread 2 first initializes the module $μ_{prim}$ , which creates the environment ${env}_{prim}$ . Next, it initializes the module $μ_{keygen}$ : it creates the store location $l_{1}$ for the token of $μ_{keygen} . init$ and defines $μ_{keygen} . init$ , which leads to the environment ${env}_{2}$ . Then it goes into evaluation contexts to evaluate $μ_{keygen} . init ()$ , which leads to the stack ${stack}_{2}$ . The evaluation of $μ_{keygen} . init ()$ sets the token of $μ_{keygen} . init$ , in location $l_{1}$ , to $Invalid$ , creates the location $l_{2}$ initialized to $Callable$ for the token of oracle $Okeygen$ , and replaces $μ_{keygen} . init ()$ with the corresponding closure, which leads to the current expression ${pe}_{2}$ .

The code executed until configuration $C_{1}$ does not alter the global store, so ${globalstore}_{1} = {globalstore}_{0}$ . The execution of the $addthread$ expression removes $(μ_{keygen}, Once)$ from the set of available modules, since it can be used only once. Hence ${M I}_{1} = \emptyset$ .

8. Proof of security

This section presents the proof of correctness of our compiler. We give ourselves a CryptoVerif process $Q_{0}$ that corresponds to a cryptographic protocol. Using our compiler, we generate modules $M_{g}$ that correspond to the roles present inside $Q_{0}$ , as explained in the previous section. We consider an adversary interacting with the protocol implementation, modeled as an OCaml program ${program}_{0}$ that uses the generated modules in $M_{g}$ . As explained in Section 2, when CryptoVerif shows that $Q_{0}$ satisfies a certain security property, it shows that for any CryptoVerif adversary $Q_{adv}$ , the probability that $Q_{0} ∣ Q_{adv}$ breaks the security property is bounded by a certain bound, which CryptoVerif computes. Our goal is to show that the same probability bound also applies to the generated implementation, that is, the probability that ${program}_{0}$ breaks the security property is bounded by the same bound. To prove this property, we build from the OCaml adversary ${program}_{0}$ a CryptoVerif adversary $Q_{adv} (Q_{0}, {program}_{0})$ that simulates ${program}_{0}$ . We prove that $Q_{adv} (Q_{0}, {program}_{0}) ∣ Q_{0}$ and ${program}_{0}$ using $M_{g}$ behave similarly, hence they have the same probability of breaking the security property. To achieve this goal, we need to prove, firstly, that the translations of the oracles behave in the same way as the CryptoVerif oracles, and secondly, that our simulation is sound.

In Section 8.1, we state our assumptions on the cryptographic primitives, and show that the primitives behave correctly independently of the rest of the program. In Section 8.2, we prove that the OCaml translation of a CryptoVerif oracle behaves like the oracle. In Section 8.3, we define the CryptoVerif adversary that simulates the OCaml adversary ${program}_{0}$ . Finally, in Section 8.4, we prove that the CryptoVerif adversary interacting with $Q_{0}$ behaves like the OCaml adversary interacting with the generated implementation. This result shows the desired correctness of our compiler.

8.1. Correctness of cryptographic primitives

Let us first formalize the assumptions we make about the implementation of cryptographic primitives. Let ${program}_{prim} \overset{def}{=} program (μ_{prim})$ be the program of the module that defines the primitives and ${interface}_{prim} \overset{def}{=} interface (μ_{prim})$ be its interface. The interface ${interface}_{prim}$ consists of the function ${random}_{ℓ}$ , the functions $G_{f} (f)$ for each CryptoVerif function f, and the functions $G_{random} (T)$ , $G_{ser} (T)$ , $G_{deser} (T)$ , and $G_{pred} (T)$ for each CryptoVerif type T for which these functions are used in the translation, as described in Section 4.3. (The functions $G_{ser} (T)$ and $G_{deser} (T)$ are either both present or both absent in ${interface}_{prim}$ .) We rely on the following assumptions.

Assumption 8.1.

There are no $schedule$ , $addthread$ , $return$ , nor $event$ operations and no global store locations in ${program}_{prim}$ .

An OCaml semantic configuration in which the current thread does not use $addthread$ , $return$ , $event$ , $schedule$ operations, nor global store locations reduces by using the (Thread) reduction rule $th \to_{p} {th}^{'}$ , so we can reduce it by considering as configuration only a thread $th$ . We denote by $T T$ traces over threads.

Let ${th}_{0}^{s} \overset{def}{=} ⟨ \emptyset, {program}_{prim};;, [], \emptyset ⟩$ be a thread configuration that evaluates only the implementation of the cryptographic primitives module.

Assumption 8.2.

There exists a unique complete thread trace $T T$ beginning at the configuration ${th}_{0}^{s}$ and there exists ${env}_{prim}$ such that the last configuration of the trace $T T$ is: $th = ⟨ {env}_{prim}, ε, [], \emptyset ⟩$

This assumption means that there are no uncaught exceptions, no access to the store, and no $random$ operations in the initialization of the module $μ_{prim}$ , so that the environment ${env}_{prim}$ is always the same. Typically, the initialization just defines functions, so this assumption is not restrictive. Random choices and limited access to the store, explained below, are allowed during calls to primitives. By definition of a module, we have ${interface}_{prim} \subseteq Dom ({env}_{prim})$ .

Assumption 8.3.

For each CryptoVerif type T, OCaml values of the corresponding type $G_{T} (T)$ contain neither closures nor store or global store locations.

This assumption formalizes that data passed to or received from generated code is immutable, as mentioned in Assumption (A6): such data does not contain locations.

To establish the correspondence between CryptoVerif values and OCaml values, we define a function $G_{val T}$ , which maps each CryptoVerif bitstring a to its associated value v in OCaml. For a given type T, $G_{val T}$ must be a bijection between T and the set of OCaml values of type $G_{T} (T)$ satisfying the predicate function $G_{pred} (T)$ . Furthermore, the OCaml value $true$ and the CryptoVerif value $true$ are such that $G_{val bool} (true) = true$ , and the same goes for false. We extend this function to events by $G_{ev} (ev (a_{1}, \dots, a_{j})) = ev (G_{val T_{1}} (a_{1}), \dots, G_{val T_{j}} (a_{j}))$ if $ev$ is of type $T_{1} \times \dots \times T_{j}$ . This function is naturally extended to lists of events.

The next assumption states that the primitives have been correctly implemented, following Assumption (A2): the implementation of the cryptographic primitives in ${interface}_{prim}$ emulates the corresponding behavior of CryptoVerif, as explained below.

Assumption 8.4 (Correct primitives).

(1) For each CryptoVerif function f of type $T_{1} \times \dots \times T_{n} \to T$ , for each CryptoVerif values $a_{1}, \dots, a_{n}$ of types $T_{1}, \dots, T_{n}$ , there exist $env$ and $store$ such that $\begin{array}{rcl} ⟨ \emptyset, {env}_{prim} (G_{f} (f)) (G_{val T_{1}} (a_{1}), \dots, G_{val T_{n}} (a_{n})), [], \emptyset ⟩ \\ \to^{*} ⟨ env, G_{val T} (f (a_{1}, \dots, a_{n})), [], store ⟩ . \end{array}$

(2) For each CryptoVerif type T such that the function $G_{random} (T)$ is in ${interface}_{prim}$ , for each CryptoVerif value $a \in T$ , there exist $env$ and $store$ such that $⟨ \emptyset, {env}_{prim} (G_{random} (T)) (), [], \emptyset ⟩ \to_{1 / | T |}^{*} ⟨ env, G_{val T} (a), [], store ⟩ .$

(3) For each CryptoVerif type T such that the function $G_{pred} (T)$ is in ${interface}_{prim}$ , for each value v of the OCaml type $G_{T} (T)$ , there exist $env$ and $store$ such that $⟨ \emptyset, {env}_{prim} (G_{pred} (T)) v, [], \emptyset ⟩ \to^{*} ⟨ env, v^{'}, [], store ⟩,$ where $v^{'} = true$ when $G_{val T}^{- 1} (v)$ exists, and $v^{'} = false$ otherwise.

(4) For each CryptoVerif type T such that the functions $G_{ser} (T)$ and $G_{deser} (T)$ are in ${interface}_{prim}$ , for each CryptoVerif value $a \in T$ , there exists an OCaml string value $ser (T, a)$ , such that there exist $env$ and $store$ such that $⟨ \emptyset, {env}_{prim} (G_{ser} (T)) G_{val T} (a), [], \emptyset ⟩ \to^{*} ⟨ env, ser (T, a), [], store ⟩$ and there exist $env$ and $store$ such that $⟨ \emptyset, {env}_{prim} (G_{deser} (T)) ser (T, a), [], \emptyset ⟩ \to^{*} ⟨ env, G_{val T} (a), [], store ⟩ .$

(5) If v is a non-empty list, then for each $a \in v$ , there exist $env$ and $store$ such that $⟨ \emptyset, {env}_{prim} ({random}_{ℓ}) v, [], \emptyset ⟩ \to_{\sum_{j \in S} almostunif ({1, \dots, | v |}, j)}^{*} ⟨ env, a, [], store ⟩,$ where $S \overset{def}{=} {1 ⩽ j ⩽ | v | ∣ nth (v, j) = a}$ .

Item (1) states that the implementation $G_{f} (f)$ of the cryptographic primitive f emulates f: it returns a result that matches the result of f via the mapping $G_{val T}$ from CryptoVerif values to OCaml values. In particular, $G_{f} (f)$ does not raise exceptions when its arguments correspond to CryptoVerif values of the expected type. Since at the CryptoVerif level, f can be any function that satisfies the assumptions given in the CryptoVerif specification, Item (1) just means that the implementation of f satisfies the assumptions given in the CryptoVerif specification, as mentioned in Assumption (A2). Item (2) means that the function $G_{random} (T)$ returns a uniformly distributed random element of T. Item (3) means that $G_{pred} (T)$ returns $true$ when its argument corresponds to an element of type T, and $false$ otherwise. Item (4) specifies the correctness of the serialization and deserialization functions, using an auxiliary function $ser$ such that $ser (T, a)$ is the serialized representation of the CryptoVerif value a, of type T. Finally, Item (5) guarantees that ${random}_{ℓ}$ is programmed correctly: ${random}_{ℓ} v$ returns a random element of the list v, such that the probability of returning the jth element of v is $almostunif ({1, \dots, | v |}, j)$ . In case the same element occurs several times in v, the probability of that element is then the sum of the probabilities of all its occurrences.
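To make the requirements on $G_{pred}$ , $G_{ser}$ and $G_{deser}$ more concrete, one can write them for a toy type and check them exhaustively. The sketch below is purely illustrative: the type $T = \{0, \dots, 255\}$ represented by OCaml integers and the names `pred_t`, `ser_t`, `deser_t` and `random_t` are assumptions of this example. Such tests do not establish the assumption for a real implementation, and the distribution of `random_t` is not checked here.

```ocaml
(* Toy CryptoVerif type T = {0, ..., 255}, represented by OCaml ints. *)
let pred_t (v : int) : bool = 0 <= v && v < 256
let ser_t (v : int) : string = String.make 1 (Char.chr v)
let deser_t (s : string) : int = Char.code s.[0]
let random_t () : int = Random.int 256

let all_values = List.init 256 Fun.id

(* Serialization followed by deserialization is the identity on T. *)
let roundtrip_ok =
  List.for_all (fun a -> deser_t (ser_t a) = a) all_values

(* pred accepts the representations of elements of T and no other int
   among those tested. *)
let pred_ok =
  List.for_all pred_t all_values && not (pred_t 256) && not (pred_t (-1))

(* random_t only returns representations of elements of T. *)
let random_in_t =
  List.for_all pred_t (List.init 1000 (fun _ -> random_t ()))
```

The three booleans are expected to be `true` for this toy type. The first corresponds to Item (4) and the second to Item (3), while the third covers only the support of the distribution required by Item (2).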

In contrast to the conference version [11], in this paper, we allow the cryptographic primitives to use the store for their internal computations (which often happens in practice); the store created by the primitives appears on the right-hand side of reductions in Assumption 8.4. However, we still assume that the cryptographic primitives are pure functions: their usage of the store should not have any visible side effect, so the primitives cannot communicate across calls or communicate data to the adversary or to the rest of the code using the store. This assumption is modeled in Assumption 8.4 by considering that the primitives are initially called in an empty store. Hence, they cannot access pre-existing locations (there are none), and since their return value does not contain locations, the store at the end of the call will be unreachable. We show in Proposition 8.5 that, when the primitives are called with a non-empty initial store, they still execute in the same way as with an empty initial store: the only difference is that the unmodified initial store is added to the current store. Therefore, the primitives still do not access the initial store and the part of the store created during the execution of the primitive becomes unreachable when the primitive returns.
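The following sketch illustrates the kind of primitive that this paragraph has in mind. The function `xor_strings` is a hypothetical example, not taken from the paper: it uses a mutable buffer for its internal computation, but the buffer is allocated inside the call and the returned string is a fresh immutable copy, so no location survives the call.

```ocaml
(* Hypothetical primitive: bytewise XOR of two strings of equal length.
   Assumes both arguments have the same length, as values of a
   fixed-length type would. *)
let xor_strings (a : string) (b : string) : string =
  let n = String.length a in
  (* Internal mutable state, created by this call only. *)
  let buf = Bytes.create n in
  for i = 0 to n - 1 do
    Bytes.set buf i (Char.chr (Char.code a.[i] lxor Char.code b.[i]))
  done;
  (* Bytes.to_string copies, so the result does not share the buffer. *)
  Bytes.to_string buf
```

By contrast, a primitive that incremented a counter defined outside the function would keep state across calls, which is what the purity requirement above rules out.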

In general, when primitives make probabilistic choices, they might return the same result in several traces with a different environment and store. To simplify notations, Assumption 8.4 states that this does not happen, so that we have the same environment and store in all final configurations that yield the same result. Our proof could easily be extended to the general case if desired.

The next proposition shows that the primitives always return correct results, when they are called inside an OCaml program, so possibly with a non-empty store and a non-empty stack. We prove it in Appendix C (see the Supplemental material). It is a consequence of Assumption 8.4.

Proposition 8.5 (Correct behavior of the primitives).

Let us consider a thread $th \overset{def}{=} ⟨ env, {env}_{prim} (s) v, stack, store ⟩$ .

If $s = G_{f} (f)$ , f is a CryptoVerif function of type $T_{1} \times \dots \times T_{n} \to T$ , and $v = (G_{val T_{1}} (a_{1}), \dots, G_{val T_{n}} (a_{n}))$ for some CryptoVerif values $a_{1}, \dots, a_{n}$ of types $T_{1}, \dots, T_{n}$ , then there exist ${env}^{'}$ and ${store}^{'}$ such that $th \to^{*} ⟨ {env}^{'}, G_{val T} (f (a_{1}, \dots, a_{n})), stack, {store}^{'} ⟩ .$

If $s = G_{random} (T)$ and $v = ()$ , then for each CryptoVerif value $a \in T$ , there exist ${env}^{'}$ and ${store}^{'}$ such that $th \to_{1 / | T |}^{*} ⟨ {env}^{'}, G_{val T} (a), stack, {store}^{'} ⟩ .$

If $s = G_{pred} (T)$ , then there exist ${env}^{'}$ and ${store}^{'}$ such that $th \to^{*} ⟨ {env}^{'}, v^{'}, stack, {store}^{'} ⟩,$ where $v^{'} = true$ when $G_{val T}^{- 1} (v)$ exists, and $v^{'} = false$ otherwise.

If $s = G_{ser} (T)$ and $v = G_{val T} (a)$ , then there exist ${env}^{'}$ and ${store}^{'}$ such that $th \to^{*} ⟨ {env}^{'}, ser (T, a), stack, {store}^{'} ⟩ .$

If $s = G_{deser} (T)$ and $v = ser (T, a)$ , then there exist ${env}^{'}$ and ${store}^{'}$ such that $th \to^{*} ⟨ {env}^{'}, G_{val T} (a), stack, {store}^{'} ⟩ .$

If $s = {random}_{ℓ}$ and v is a non-empty list, then for each $a \in v$ , there exist ${env}^{'}$ and ${store}^{'}$ such that $th \to_{\sum_{j \in S} almostunif ({1, \dots, | v |}, j)}^{*} ⟨ {env}^{'}, a, stack, {store}^{'} ⟩,$ where $S \overset{def}{=} {1 ⩽ j ⩽ | v | ∣ nth (v, j) = a}$ .

In all cases, we have ${store}^{'} \supseteq store$ .

8.2. Correctness of the translation of oracle bodies

In this section, we show the correctness of the translation of oracle bodies in our compiler: we show a correspondence between the semantics of the oracle body in CryptoVerif and the semantics of its translation into OCaml.

Let $fv (M)$ , $fv (P)$ , $fv (Q)$ be the sets of free variables of the CryptoVerif term M and processes P and Q, respectively. These sets are defined as usual, except that each variable comes with its indices: for example, the free variables of the term $x [\tilde{i}]$ are $fv (x [\tilde{i}]) \overset{def}{=} {x [\tilde{i}]}$ . We extend this definition to terms and processes in which the replication indices $\tilde{i}$ have been instantiated to bitstrings: for example, $fv (x [\tilde{a}]) = {x [\tilde{a}]}$ . We extend this definition to sets of processes $\mathcal{Q}$ by $fv (\mathcal{Q}) = ⋃_{Q \in \mathcal{Q}} fv (Q)$ and to stacks by $fv (S) = ⋃_{((x_{1} [\tilde{a}], \dots, x_{k} [\tilde{a}]), P_{1}, P_{2}) \in S} fv (P_{1}) ∖ {x_{1} [\tilde{a}], \dots, x_{k} [\tilde{a}]} \cup fv (P_{2})$ .

Next, we define the OCaml value corresponding to a CryptoVerif table, and we use this definition to define the OCaml environment and global store corresponding to a CryptoVerif environment and to CryptoVerif tables.

Definition 8.6 (CryptoVerif table to OCaml list).

Let us consider a table $Tbl$ of type $T_{1} \times \dots \times T_{l}$ . The serialized OCaml value that corresponds to an element of this table is $G_{tblel} (Tbl, (b_{1}, \dots, b_{l})) \overset{def}{=} (ser (T_{1}, G_{val T_{1}} (b_{1})), \dots, ser (T_{l}, G_{val T_{l}} (b_{l}))) .$ Let $t = [a_{1}; \dots; a_{k}]$ be the contents of the table $Tbl$ : each $a_{i}$ is an element of the table. Let us denote $G_{tbl} (Tbl, t) \overset{def}{=} [G_{tblel} (Tbl, a_{1}); \dots; G_{tblel} (Tbl, a_{k})]$ the OCaml list corresponding to t.
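As a small illustration of Definition 8.6, consider a table with two columns. The sketch below assumes toy column types (integers in $\{0, \dots, 255\}$ and booleans) with hypothetical serializers `ser_t1` and `ser_t2`, and it identifies CryptoVerif values with their OCaml representations for simplicity.

```ocaml
(* Hypothetical serializers for the two column types. *)
let ser_t1 (v : int) : string = String.make 1 (Char.chr v)
let ser_t2 (b : bool) : string = if b then "1" else "0"

(* G_tblel: serialize each component of one table element. *)
let g_tblel ((b1, b2) : int * bool) : string * string =
  (ser_t1 b1, ser_t2 b2)

(* G_tbl: the OCaml list that corresponds to the table contents. *)
let g_tbl (t : (int * bool) list) : (string * string) list =
  List.map g_tblel t

let example = g_tbl [ (65, true); (66, false) ]
```

Here `example` is `[("A", "1"); ("B", "0")]`: the list has one tuple of serialized components per table element, in the same order as the table contents.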

Definition 8.7 (Minimal environment and global store).

$\begin{array}{l} env (E, P) \overset{def}{=} {G_{var} (x) \mapsto G_{val T_{x}} (E (x [\tilde{a}])) ∣ x [\tilde{a}] \in fv (P)} & (Environment) \\ globalstore (E, T) \overset{def}{=} {f \mapsto G_{tbl} (Tbl, T (Tbl)) ∣ (Tbl, f) \in Tables} \\ \cup {f \mapsto ser (T_{x}, a) ∣ (x [], f) \in Files, E (x []) = a} \\ \cup {f \mapsto "" ∣ (x [], f) \in Files, x not defined in E} & (Globalstore) \end{array}$

We define $env (E, M)$ and $env (E, Q)$ in the same way.

The $globalstore$ function defined above returns the global store in which the contents of the files and the tables are correct with respect to the CryptoVerif configuration elements E and $T$ . The $env$ function returns the environment corresponding to E for the free variables in P (or M, or Q).

First, we show a correspondence between a CryptoVerif term and its OCaml translation.

Lemma 8.8 (Term reduction).

Let M be a CryptoVerif term of type T. If $th = ⟨ env, G_{M} (M), stack, store ⟩ with env \supseteq {env}_{prim} \cup env (E, M),$ and $E \cdot M ⇓ a$ , then $th \to^{*} {th}^{'}$ where ${th}^{'} \overset{def}{=} ⟨ {env}^{'}, G_{val T} (a), stack, {store}^{'} ⟩$ for some ${env}^{'}$ and ${store}^{'}$ such that ${store}^{'} \supseteq store$ .

In this lemma, we consider an OCaml thread that evaluates the translation $G_{M} (M)$ of the CryptoVerif term M. We assume that its environment contains the cryptographic primitives and the minimal environment for M, as defined in Definition 8.7. We also assume that, in CryptoVerif, M evaluates to a, and we show that correspondingly, in OCaml, $G_{M} (M)$ evaluates to $G_{val T} (a)$ . The final store is an extension of the initial one, since primitives may create store locations internally. We prove this result by induction on the syntax of terms and by using Proposition 8.5 for the evaluation of cryptographic primitives.
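The structure of this induction can be mirrored by a small evaluator. The sketch below is an assumption-laden toy, not the translation $G_{M}$ : terms are either variables or applications of primitives, values are strings, and primitives and environments are association lists.

```ocaml
(* Hypothetical term syntax: variables and primitive applications. *)
type term = Var of string | App of string * term list

(* Evaluate a term, given the primitives and an environment that binds
   at least the free variables of the term. *)
let rec eval prims env = function
  | Var x -> List.assoc x env
  | App (f, args) ->
      (* Evaluate the arguments first, then call the primitive. *)
      (List.assoc f prims) (List.map (eval prims env) args)

let prims = [ ("concat", String.concat "") ]
let env = [ ("x", "a"); ("y", "b") ]
let m = App ("concat", [ Var "x"; App ("concat", [ Var "y"; Var "x" ]) ])
```

Evaluating `m` in `env` gives `"aba"`, and so does evaluating it in any extension of `env`, which reflects the hypothesis that the environment only needs to contain the minimal environment for M.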

Let us now introduce some notations that allow us to designate the various parts of OCaml semantic configurations.

Definition 8.9 (Helper functions).

For an OCaml configuration $C = [{th}_{1}, \dots, {th}_{n}], globalstore, {tj}^{'}, M I, events$ with ${th}_{tj} = ⟨ {env}_{tj}, {pe}_{tj}, {stack}_{tj}, {store}_{tj} ⟩$ for all $tj ⩽ n$ , let us define the following functions: $\begin{array}{rcl} C_{pe} (C) \overset{def}{=} {pe}_{{tj}^{'}}, C_{th} (C) \overset{def}{=} {th}_{{tj}^{'}}, \\ C_{globalstore} (C) \overset{def}{=} globalstore, C_{events} (C) \overset{def}{=} events . \end{array}$ We also define $\begin{array}{rcl} C [th \mapsto {th}^{'}, globalstore \mapsto {globalstore}^{'}, MI \mapsto {M I}^{'}, events \mapsto {events}^{'}] \\ \overset{def}{=} [{th}_{1}, \dots, {th}_{{tj}^{'} - 1}, {th}^{'}, {th}_{{tj}^{'} + 1}, \dots, {th}_{n}], {globalstore}^{'}, {tj}^{'}, {M I}^{'}, {events}^{'} . \end{array}$ In this notation, one can omit $globalstore$ , $MI$ or $events$ . When omitted, we keep the corresponding element of the configuration $C$ .

The notation $C_{pe} (C)$ denotes the current program or expression of $C$ , $C_{th} (C)$ denotes its current thread, $C_{globalstore} (C)$ its global store, and $C_{events} (C)$ its list of events. The notation $C [th \mapsto {th}^{'}, globalstore \mapsto {globalstore}^{'}, MI \mapsto {M I}^{'}, events \mapsto {events}^{'}]$ allows us to modify some elements of the configuration $C$ .
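These notations correspond to ordinary record accessors and a functional update. The sketch below uses a deliberately simplified and hypothetical representation of configurations (strings and association lists for all components); optional arguments play the role of the components that may be omitted in the update notation.

```ocaml
(* Simplified, hypothetical thread and configuration records. *)
type thread = {
  env : (string * string) list;
  pe : string;
  stack : string list;
  store : (int * string) list;
}

type config = {
  threads : thread list;                 (* [th_1; ...; th_n] *)
  globalstore : (string * string) list;
  current : int;                         (* index of the active thread, from 1 *)
  mi : (string * string) list;
  events : string list;
}

let c_th (c : config) : thread = List.nth c.threads (c.current - 1)
let c_pe (c : config) : string = (c_th c).pe
let c_globalstore (c : config) = c.globalstore
let c_events (c : config) = c.events

(* Replace the current thread; omitted components are kept from [c]. *)
let update ?globalstore ?mi ?events (th' : thread) (c : config) : config =
  { threads =
      List.mapi (fun j th -> if j + 1 = c.current then th' else th) c.threads;
    globalstore = Option.value globalstore ~default:c.globalstore;
    current = c.current;
    mi = Option.value mi ~default:c.mi;
    events = Option.value events ~default:c.events }
```

For instance, `update ~events:["e"] th' c` replaces the current thread and the list of events, and leaves the global store and the set of available modules unchanged.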

Next, we prove that the CryptoVerif oracle bodies P are correctly translated into OCaml as $G (P)$ . We extend the translation $G (P)$ to processes in which some replication indices have been instantiated into their values, using the formulas of Section 7 where replication indices i may be replaced with their value a. It is easy to see that $G (P {a / i}) = G (P)$ .

Lemma 8.10 (Inner reduction).

Let $C$ be a CryptoVerif configuration. Suppose that the program part P of $C$ is not in a return, end, call or loop form. Suppose that we have n possible reductions beginning at this configuration: $C = E, P, T, Q, S, \mathcal{E} \to_{p_{i}} C_{i} = E_{i}, P_{i}, T_{i}, Q, S, \mathcal{E}_{i}$ for $i ⩽ n$ . Let $C$ be an OCaml configuration such that $\begin{array}{rcl} C_{th} (C) = ⟨ env, G (P), stack, store ⟩ with env \supseteq {env}_{prim} \cup env (E, P), \\ C_{globalstore} (C) \supseteq globalstore (E, T), \\ C_{events} (C) = G_{ev} (\mathcal{E}) . \end{array}$ Then there exist n disjoint sets of OCaml traces ${C T S}_{1}, \dots, {C T S}_{n}$ all starting at $C$ such that none of these traces is a prefix of another of these traces, $Pr [{C T S}_{i}] = p_{i}$ for all $i ⩽ n$ , and if $C^{'}$ is the last configuration of a trace in ${C T S}_{i}$ , then we have $C^{'} = C [th \mapsto {th}^{'}, globalstore \mapsto {globalstore}^{'}, events \mapsto {events}^{'}]$ where $\begin{array}{rcl} {th}^{'} = ⟨ {env}^{'}, G (P_{i}), stack, {store}^{'} ⟩ with {env}^{'} \supseteq {env}_{prim} \cup env (E_{i}, P_{i}) and {store}^{'} \supseteq store, \\ {globalstore}^{'} \supseteq globalstore (E_{i}, T_{i}), \\ {globalstore}^{'} (l) = C_{globalstore} (C) (l) for all l \notin {Loc}_{priv}, \\ {events}^{'} = G_{ev} (\mathcal{E}_{i}) . \end{array}$

The proof of this lemma can be found in Appendix D (see the Supplemental material). This lemma is proved by cases on the process P. We use Lemma 8.8 when we need to evaluate a term. The cases $end$ and $return$ will be handled when we prove the invariant for the whole system; the oracle bodies that we translate into OCaml contain neither calls nor loops. This lemma shows that the following invariants are preserved during the evaluation of oracle bodies: the OCaml environment and global store contain the minimal environment and global store corresponding to the CryptoVerif configuration; the public part of the global store does not change; the OCaml and CryptoVerif events match. Locations may be added to the store, but the contents of existing locations do not change. We use sets of traces on the OCaml side, because the OCaml implementation of primitives may make internal random choices, leading to several traces for the same arguments and the same result, which all correspond to the same CryptoVerif trace.

8.3. Simulation of the OCaml adversary

In this section, we show how to simulate in CryptoVerif any OCaml program ${program}_{0}$ that corresponds to an adversary interacting with the protocol implementation generated from the CryptoVerif process $Q_{0}$ . Basically, we run the OCaml program ${program}_{0}$ inside the CryptoVerif function ${simulate}_{ML}$ (which is possible since these functions can represent any deterministic Turing machine). When ${program}_{0}$ needs to call an oracle of $Q_{0}$ , the function returns and the call is made by CryptoVerif. When ${program}_{0}$ needs to generate a random number, this generation is performed by CryptoVerif.

Fig. 18. The program $Q_{adv} (Q_{0}, {program}_{0})$ .

In more detail, from the OCaml program ${program}_{0}$ , we define a CryptoVerif adversary $Q_{adv} (Q_{0}, {program}_{0})$ given in Fig. 18. We will prove that this process, when executed in parallel with $Q_{0}$ , has the same behavior as the OCaml program ${program}_{0}$ . The initial CryptoVerif configuration is then $C_{0} (Q_{0}, {program}_{0}) = C_{i} (Q_{0} ∣ Q_{adv} (Q_{0}, {program}_{0}))$ . Informally, in Fig. 18, the state s is a bitstring representation of the current OCaml semantic configuration. The oracle $O_{start}$ iterates the oracle $O_{loop}$ with initial state $s_{0} = s_{0} (Q_{0}, {program}_{0})$ , which is a bitstring representation of the initial OCaml configuration in which ${program}_{0}$ is executed. Inside $O_{loop} (s)$ , the function ${simulate}_{ML} (s)$ basically runs the OCaml program from state s, following the OCaml semantics with the following exceptions:

When the OCaml program calls an oracle, ${simulate}_{ML}$ returns $(s^{'}, o, i, args)$ where $s^{'}$ is a bitstring representation of the new OCaml semantic configuration, o is a constant among $o_{1}, o_{2}, \dots$ that encodes which oracle is called, i is the tuple of indices with which the oracle is called, and $args$ is the tuple of arguments of the oracle. In this case, $O_{loop}$ calls the corresponding oracle O (lines 10–17). If the oracle call succeeds, it uses the function ${simulate}_{ret O}$ , which replaces the oracle call with the result $r_{i, 1}, \dots, r_{i, m_{i}^{'}}$ of the oracle in the OCaml configuration $s^{'}$ (see Definition 8.13). If the oracle call fails, the call raises the exception $Match\_failure$ in OCaml; the function ${simulate}_{end O}$ then replaces the oracle call with this exception in the OCaml configuration $s^{'}$ (see Definition 8.13). The execution of the program then continues with the new configuration in the next iteration.

When the OCaml program chooses a random bit, ${simulate}_{ML}$ returns $(s^{'}, o_{R}, (), ())$ where $s^{'}$ is again a bitstring representation of the current OCaml semantic configuration. In this case, $O_{loop}$ chooses a random bit (lines 18–20) and uses the function ${simulate}_{R}$ (see Definition 8.14) to integrate that random bit into the OCaml configuration $s^{'}$ . The execution of the program continues with the new configuration in the next iteration.

When the OCaml program terminates, ${simulate}_{ML}$ returns $(s^{'}, o_{S}, (), ())$ , and the CryptoVerif adversary also terminates. (The second element returned by $O_{loop}$ is $stop$ , which stops the iteration.)
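The three cases above can be pictured as a dispatch loop. The following Python sketch is only an illustration and is not the process of Fig. 18: the names `run_adversary`, `simulate_ml`, `oracles`, `ret`, `end` and `rand` are hypothetical stand-ins for $O_{loop}$ , ${simulate}_{ML}$ , the oracles of $Q_{0}$ , ${simulate}_{ret O}$ , ${simulate}_{end O}$ and ${simulate}_{R}$ . States are arbitrary Python values rather than bitstrings, and a failed oracle call is modelled by the result `None`.

```python
import random

def run_adversary(s0, simulate_ml, oracles, ret, end, rand, max_iter,
                  coin=lambda: random.random() < 0.5):
    """Iterate a hypothetical O_loop at most max_iter times from state s0.

    simulate_ml(s) returns (s', tag, indices, args), where tag is "R"
    (random bit requested), "S" (program terminated), or an oracle name
    that is a key of `oracles`.
    """
    s = s0
    for _ in range(max_iter):
        s1, tag, indices, args = simulate_ml(s)
        if tag == "S":
            # The simulated program terminated: stop the iteration.
            return s1
        if tag == "R":
            # The random bit is drawn outside the deterministic simulator.
            s = rand(s1, coin())
            continue
        # The oracle call is performed by the caller, not by the simulator.
        result = oracles[tag](indices, args)
        if result is None:
            s = end(tag, s1)          # failed call: inject the exception
        else:
            s = ret(tag, s1, result)  # successful call: inject the result
    return s
```

Keeping the random choice and the oracle calls outside `simulate_ml` is what allows that function to stay deterministic in this sketch.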

The rest of this section is devoted to the formal definition of all elements used in Fig. 18.

We assume that the OCaml program ${program}_{0}$ runs in bounded time, so it makes a bounded number of oracle calls. By Assumption 4.15, when an oracle O (resp. role $role$ ) is under replication, this replication has bound $N_{O}$ (resp. $N_{role}$ ). When oracle O is under replication, we let $N_{O}$ be the maximum number of calls to the same closure ${tagfunction}^{O, τ} [env, pm]$ corresponding to oracle O. When a role $role$ is under replication, we let $N_{role}$ be the maximum number of executions of $addthread (program)$ for some $program$ that contains $μ_{role}$ . These replication bounds are chosen such that the OCaml program ${program}_{0}$ never exhausts the number of oracle calls allowed by the CryptoVerif process. We let $N_{rand + calls}$ be the maximum number of oracle calls and random number generations that the OCaml program ${program}_{0}$ can make plus one. We let $N_{steps}$ be the maximum number of reduction steps of the program ${program}_{0}$ in the semantics of OCaml. Formally, we use the following definition.

Definition 8.11.

The number of calls to the closure with tag O, τ in a trace $C T$ , denoted $N_{calls} (O, τ, C T)$ , is the number of configurations $C$ such that $C_{pe} (C) = {tagfunction}^{O, τ} [env, pm] v$ in $C T$ excluding its last configuration.

The number of executions of role $role$ in a trace $C T$ , denoted $N_{exec} (role, C T)$ , is the number of configurations $C$ such that $C_{pe} (C) = addthread (program)$ where $program$ contains $program (μ_{role})$ in $C T$ excluding its last configuration.

The number of random number generations in a trace $C T$ , denoted $N_{rand} (C T)$ , is the number of configurations $C$ such that $C_{pe} (C) = random ()$ in $C T$ excluding its last configuration.

We define $\begin{array}{rcl} N_{O} & \overset{def}{=} & \max_{C T, τ} N_{calls} (O, τ, C T) \\ N_{role} & \overset{def}{=} & \max_{C T} N_{exec} (role, C T) \\ N_{rand + calls} & \overset{def}{=} & \max_{C T} (N_{rand} (C T) + \sum_{O, τ} N_{calls} (O, τ, C T)) + 1 \\ N_{steps} & \overset{def}{=} & \max_{C T} | C T |, \end{array}$ where $C T$ ranges over traces that begin with the configuration $C_{0} (Q_{0}, {program}_{0})$ .
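The counting in Definition 8.11 can be illustrated on a toy representation. The Python sketch below assumes that a finite list of traces is given (whereas the definition takes the maximum over all traces from the initial configuration) and that each trace is simply the list of its current expressions, encoded as hypothetical tuples such as `("call", "O1", "tau")`, `("random",)` or `("other",)`. The function names are illustrative only.

```python
def count_matching(trace, matches):
    """Number of configurations of `trace`, excluding the last one,
    whose current expression satisfies the predicate `matches`."""
    return sum(1 for pe in trace[:-1] if matches(pe))

def bound_rand_plus_calls(traces, is_call, is_random):
    """Maximum over the given traces of the number of random generations
    plus the number of oracle calls, plus one."""
    return max(count_matching(t, is_random) + count_matching(t, is_call)
               for t in traces) + 1

def bound_steps(traces):
    """Maximum length of the given traces."""
    return max(len(t) for t in traces)
```

Summing the calls over all pairs $O, τ$ is the same as counting every call expression, which is why a single predicate `is_call` suffices here.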

While $N_{O}$ is an optimal bound, $N_{role}$ is not optimal. Consider for instance a process of the form $foreach i ⩽ N_{O} do O () : = \dots \} foreach j ⩽ N_{role} do role \{ \dots$ By distributing the instantiations of $role$ on every available index i, the optimal bound of the replication j is the maximum during all executions of ${program}_{0}$ of the number of instantiations of $role$ divided by the number of calls to O made before these instantiations of $role$ . To get this optimal bound, we would need to associate each new instantiation of $role$ to the index i with the least number of associated instantiations of $role$ . Since a role is often under at most one replication, we decided not to complicate the proof with details needed to get the optimal bound.

In Fig. 18, we use a $let$ construct with pattern matching, which can be defined as follows. We define the function ${tuple}_{T_{1}, \dots, T_{j}} : T_{1} \times \dots \times T_{j} \to bitstring$ that creates a tuple with j elements (for instance by concatenating the j bitstrings with information on their length, so that they can be unambiguously recovered), and the associated projections $π_{k, T_{1}, \dots, T_{j}} : bitstring \to T_{k}$ with $k ⩽ j$ (which may return any value when their argument is not a tuple with j elements). The construct $let (x_{1} : T_{1}, \dots, x_{j} : T_{j}) = M in P$ is an abbreviation for: $\begin{array}{rcl} x \leftarrow M; x_{1} \leftarrow π_{1, T_{1}, \dots, T_{j}} (x); \dots; x_{j} \leftarrow π_{j, T_{1}, \dots, T_{j}} (x); \\ if x = {tuple}_{T_{1}, \dots, T_{j}} (x_{1}, \dots, x_{j}) then P else end \end{array}$ where x is a fresh variable. The CryptoVerif term $(M_{1}, \dots, M_{j})$ is an abbreviation for ${tuple}_{T_{1}, \dots, T_{j}} (M_{1}, \dots, M_{j})$ , where $T_{1}, \dots, T_{j}$ are the types of $M_{1}, \dots, M_{j}$ , respectively.
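As an illustration of the tuple encoding suggested above, here is a minimal Python sketch. It assumes one particular encoding (each component preceded by its length as a 4-byte big-endian integer) and ignores the types $T_{1}, \dots, T_{j}$ , treating every component as a plain bitstring. The names `tuple_encode`, `project` and `let_tuple` are hypothetical.

```python
import struct

def tuple_encode(*parts: bytes) -> bytes:
    """Concatenate the parts, each preceded by its 4-byte big-endian length."""
    return b"".join(struct.pack(">I", len(p)) + p for p in parts)

def project(k: int, j: int, x: bytes) -> bytes:
    """k-th component (1-based) of x read as a j-tuple.

    On malformed input this sketch returns b""; the text allows the
    projections to return any value in that case.
    """
    pos, found = 0, b""
    for idx in range(1, j + 1):
        if pos + 4 > len(x):
            return b""
        (n,) = struct.unpack(">I", x[pos:pos + 4])
        pos += 4
        if pos + n > len(x):
            return b""
        if idx == k:
            found = x[pos:pos + n]
        pos += n
    return found

def let_tuple(x: bytes, j: int):
    """Components of x if x is exactly a j-tuple, else None.

    Mirrors the abbreviation: project every component, then check that
    re-encoding them gives back x (the `else end` branch returns None).
    """
    parts = [project(k, j, x) for k in range(1, j + 1)]
    return parts if tuple_encode(*parts) == x else None
```

The final equality test is what makes the pattern matching sound even though the projections are total: a bitstring that is not a well-formed j-tuple is rejected by the comparison.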

Let $O_{1}, \dots, O_{n}$ be the oracle names in $Q_{0}$ . We define n constants $o_{1}, \dots, o_{n}$ which are used to designate the oracles $O_{1}, \dots, O_{n}$ respectively, $o_{R}$ which corresponds to a random choice, and $o_{S}$ which corresponds to the end of the OCaml program. We define the CryptoVerif type $T_{o} \overset{def}{=} \{o_{R}, o_{S}, o_{1}, \dots, o_{n}\}$ , which contains all these bitstring constants.

The adversary is mainly encoded by the function ${simulate}_{ML}$ . This function takes as argument the bitstring representation $s = repr (C S)$ of a simulator configuration $C S$ . The configuration $C S$ consists of a non-instrumented OCaml configuration $C$ (with some extensions to the syntax described later) and sets $R I$ and $I$ that finitely represent the callable oracles $Q$ of the CryptoVerif configuration: $C S = \underbrace{([{th}_{1}, \dots, {th}_{n}], globalstore, i)}_{C}, R I, I .$ The function $repr$ is injective. We denote its inverse by ${repr}^{- 1}$ . We also define a CryptoVerif type $T_{C S}$ that consists of all bitstrings in the image of $repr$ , that is, all bitstrings that correspond to simulator configurations $C S$ . We also use the notations of Definition 8.9 for simulator configurations.

When we call an oracle or instantiate a role under replication, we must choose an unused replication index for this replication, and call the oracle or instantiate the role with that replication index. In this simulation, we will always choose the smallest replication index that has not been used yet, so that the used indices form an interval $[1, a - 1]$ and the unused indices are in $[a, N]$ where N is the bound of the considered replication. The sets $R I$ and $I$ represent the sets of callable roles and oracles, by storing the smallest index a that is not used yet.

More precisely, the set $R I$ represents the set of callable roles with their replication indices. Elements of $R I$ are either:

of the form $role [[a, + \infty [, {\tilde{a}}^{'}]$ . Intuitively, this element represents all roles $role [a^{″}, {\tilde{a}}^{'}]$ for $a^{″} ⩾ a$ , which we represent by the interval $[a, + \infty [$ . When $role [[a, + \infty [, {\tilde{a}}^{'}]$ is in $R I$ , the role $role$ is under replication, the roles $role [1, {\tilde{a}}^{'}]$ to $role [a - 1, {\tilde{a}}^{'}]$ have been used, and the roles $role [a, {\tilde{a}}^{'}]$ to $role [N_{role}, {\tilde{a}}^{'}]$ are usable;

or of the form $role [\tilde{a}]$ , which means that $role$ is not under replication and the role $role$ is callable with the replication indices $\tilde{a}$ .

The set $R I$ never contains simultaneously $role [[a, + \infty [, {\tilde{a}}^{'}]$ and $role [{\tilde{a}}^{″}]$ for the same $role$ and any a, ${\tilde{a}}^{'}$ , ${\tilde{a}}^{″}$ , and it never contains simultaneously $role [[a, + \infty [, {\tilde{a}}^{'}]$ and $role [[a^{″}, + \infty [, {\tilde{a}}^{'}]$ with $a \neq a^{″}$ for the same $role$ and ${\tilde{a}}^{'}$ .

The set $I$ represents the set of callable oracles with their replication indices. Elements of $I$ are either:

of the form $O [[a, + \infty [, {\tilde{a}}^{'}]$ , which means that the oracle O is under replication and the oracles $O [1, {\tilde{a}}^{'}]$ to $O [a - 1, {\tilde{a}}^{'}]$ have been used, and the oracles $O [a, {\tilde{a}}^{'}]$ to $O [N_{O}, {\tilde{a}}^{'}]$ are usable;

or of the form $O [\tilde{a}]$ which means that O is an oracle not under replication that can be called with the replication indices $\tilde{a}$ .

The set $I$ never contains simultaneously $O [[a, + \infty [, {\tilde{a}}^{'}]$ and $O [{\tilde{a}}^{″}]$ for the same O and any a, ${\tilde{a}}^{'}$ , ${\tilde{a}}^{″}$ , and it never contains simultaneously $O [[a, + \infty [, {\tilde{a}}^{'}]$ and $O [[a^{″}, + \infty [, {\tilde{a}}^{'}]$ with $a \neq a^{″}$ for the same O and ${\tilde{a}}^{'}$ .

Next, we define functions that manipulate these sets of oracles and roles. We define the subtraction operation $I - O [\tilde{a}]$ on sets of oracles.

If $O [[a, + \infty [, {\tilde{a}}^{'}]$ is in $I$ , then $I - (O [a, {\tilde{a}}^{'}]) \overset{def}{=} I ∖ \{O [[a, + \infty [, {\tilde{a}}^{'}]\} \cup \{O [[a + 1, + \infty [, {\tilde{a}}^{'}]\} .$

If $O [\tilde{a}]$ is in $I$ , then $I - (O [\tilde{a}]) \overset{def}{=} I ∖ \{O [\tilde{a}]\} .$
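The two cases of the subtraction can be sketched in Python under an assumed representation that is not taken from the paper: the set $I$ is a dictionary whose keys are pairs (oracle name, tuple of outer indices) and whose values are either an integer a, standing for the interval $[a, + \infty [$ of an oracle under replication, or `None` for an oracle not under replication. The name `subtract` is hypothetical.

```python
def subtract(I, name, indices):
    """Return a new dictionary representing I - name[indices].

    Replicated case: name[[a, +inf[, rest] becomes name[[a + 1, +inf[, rest]
    when indices == (a,) + rest. Non-replicated case: name[indices] is
    removed. Raises KeyError if the oracle is not callable with these indices.
    """
    indices = tuple(indices)
    out = dict(I)  # leave the argument untouched
    if indices and I.get((name, indices[1:])) == indices[0]:
        out[(name, indices[1:])] = indices[0] + 1
        return out
    if (name, indices) in I and I[(name, indices)] is None:
        del out[(name, indices)]
        return out
    raise KeyError((name, indices))
```

Using one dictionary entry per oracle name and outer indices makes it impossible to store two different intervals for the same oracle, which matches the second exclusion stated for the set $I$ .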

We define similarly the subtraction on sets of roles $R I - role [\tilde{a}]$ . We also generalize this operator to sets: $R I - \{{role}_{1} [\tilde{a_{1}}], \dots, {role}_{k} [\tilde{a_{k}}]\} \overset{def}{=} (\dots (R I - {role}_{1} [\tilde{a_{1}}]) - \dots) - {role}_{k} [\tilde{a_{k}}] .$

We let $smallest (R I, role)$ be the smallest indices present for the role $role$ in $R I$ : when $\tilde{a} = smallest (R I, role)$ , we have $role [\tilde{a}] \in R I$ or there exist $a^{'}$ and ${\tilde{a}}^{'}$ such that $\tilde{a} = a^{'}, {\tilde{a}}^{'}$ and $role [[a^{'}, + \infty [, {\tilde{a}}^{'}] \in R I$ .

Let us define the function $oraclelist$ , which is similar to $oracledeflist$ but just returns the oracle name and its replication indices $\tilde{i}$ (which can be partly instantiated to values), instead of returning the entire oracle definition: $\begin{array}{ll} oraclelist (0) \overset{def}{=} [] & (Nil) \\ oraclelist (Q_{1} ∣ Q_{2}) \overset{def}{=} oraclelist (Q_{1}) @ oraclelist (Q_{2}) & (Par) \\ oraclelist (foreach i^{'} ⩽ n do Q) \overset{def}{=} [O_{1} [\_, \tilde{i}], \dots, O_{l} [\_, \tilde{i}]] & \\ \quad \text{when } oraclelist (Q) = [O_{1} [i^{'}, \tilde{i}], \dots, O_{l} [i^{'}, \tilde{i}]] & (Repl) \\ oraclelist (role \{ Q) \overset{def}{=} [] & (Role) \\ oraclelist (O [\tilde{i}] (x_{1} [\tilde{i}], \dots, x_{k} [\tilde{i}]) : = P) \overset{def}{=} [O [\tilde{i}]] & (Oracle) \end{array}$ This function returns elements of the form $O [\tilde{i}]$ for oracles that are not directly under replication and $O [\_, \tilde{i}]$ for oracles directly under replication. Similarly to $oracledeflist$ , this function returns an empty list when encountering a role definition.

Let us consider a process $Q^{'} = foreach i^{'} ⩽ n do Q$ . By Assumption 4.14, there is no replication in Q, and so all oracles in Q are under the same replications and have exactly the same replication indices $i^{'}$ , $\tilde{i}$ , where the indices $\tilde{i}$ are the replication indices of replications above $Q^{'}$ . So, by rule (Repl), $oraclelist (Q^{'})$ produces the list of callable oracles in Q where we replace the replication index $i^{'}$ with $\_$ .
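The five rules of $oraclelist$ translate directly into a recursive function. The Python sketch below assumes a hypothetical tuple encoding of processes that is not part of the paper: `("nil",)`, `("par", Q1, Q2)`, `("repl", i, Q)`, `("role", name, Q)` and `("oracle", name, indices)`, where oracle bodies are omitted and indices are given as names, innermost replication index first.

```python
def oraclelist(q):
    """List of (oracle name, indices) callable at the head of process q.

    The index of a replication directly above an oracle is replaced with
    "_", as in rule (Repl); role definitions contribute nothing, as in
    rule (Role).
    """
    kind = q[0]
    if kind == "nil":
        return []
    if kind == "par":
        return oraclelist(q[1]) + oraclelist(q[2])
    if kind == "repl":
        _, i, body = q
        return [(name, tuple("_" if x == i else x for x in idx))
                for name, idx in oraclelist(body)]
    if kind == "role":
        return []
    if kind == "oracle":
        return [(q[1], tuple(q[2]))]
    raise ValueError("unknown process form: %r" % (kind,))
```

For instance, on a replication over two parallel oracles, both come back with `"_"` in place of the replication index, while an oracle hidden behind a role definition is not listed.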

By Property 4.5, an oracle with a certain name O always takes arguments of the same types and always returns values of the same types. So we can say that the oracle $O_{i}$ takes $m_{i}$ arguments of types $T_{i, 1}, \dots, T_{i, m_{i}}$ , and returns $m_{i}^{'}$ bitstrings of types $T_{i, 1}^{'}, \dots, T_{i, m_{i}^{'}}^{'}$ . We can also define $returnoracles (O [\tilde{i}]) \overset{def}{=} oraclelist (Q)$ where Q is an oracle definition located after a $return$ statement in a body of the oracle $O [\tilde{i}]$ in $Q_{0}$ . This definition is correct because, by Property 4.5, the structure of the processes Q after any return statement of a given oracle O is always the same, so the list $oraclelist (Q)$ will be the same for each of these Q. The function $returnoracles$ can take an oracle with its replication indices partly instantiated to values: $returnoracles (O [\tilde{a}]) \overset{def}{=} returnoracles (O [\tilde{i}]) \{\tilde{a} / \tilde{i}\}$ .

Let us recall that we denote by $Q (role)$ the subprocess of $Q_{0}$ that corresponds to the role $role$ . For a subprocess Q of $Q_{0}$ that is under replication indices $\tilde{i}$ in $Q_{0}$ , we denote by $Q [\tilde{a}]$ the process Q in which the elements of $\tilde{i}$ are replaced with the respective elements of $\tilde{a}$ .

Definition 8.12 (First oracle).

The first oracles of a role $role$ are the oracles that can be called when we are at the beginning of the subprocess corresponding to the role, that is, $oraclelist (Q (role))$ .

We define $add (I, R I)$ as the addition to $I$ of the first oracles of the roles present in $R I$ : $\begin{array}{rcl} add (I, R I) & \overset{def}{=} & I \cup \{O [\tilde{a}] ∣ role [\tilde{a}] \in R I, O [\tilde{a}] \in oraclelist (Q (role) [\tilde{a}])\} \\ & & \cup \{O [[1, + \infty [, \tilde{a}] ∣ role [\tilde{a}] \in R I, O [\_, \tilde{a}] \in oraclelist (Q (role) [\tilde{a}])\} \end{array}$

Fig. 19. Semantics followed by the simulator.

The syntax of the language of the simulator is almost the same as the language we described in Section 5, with the addition of tagged functions introduced in Section 6. We add the functional values $call (O [\tilde{a}])$ and $call (O [\_, \tilde{a}])$ that replace our generated closures for the oracle O. The value $call (O [\tilde{a}])$ is used when O is not directly under replication; $call (O [\_, \tilde{a}])$ is used when O is directly under replication.

We present the semantics followed by our simulator in Fig. 19. When we encounter a configuration containing a successful call to an oracle (by $call$ ) or a random operation, we cannot reduce. These operations are executed, but not inside the simulator: we stop the simulator in its current state, and in CryptoVerif, we call the requested oracle with the requested arguments, or generate a random bit. Otherwise, when the simulator configuration reduces into another configuration in the OCaml semantics, by rule (Simulator toplevel), we also reduce in the same way. By rules (FailedCall1) and (FailedCall2), we raise the exception $Bad\_Call$ when the call to the oracle is invalid, as our generated code does in this case. Notice that, in the OCaml implementation, the adversary can test whether an oracle call succeeds or not, by catching the exception $Bad\_Call$ . In CryptoVerif, failed calls can happen only when the called oracle is not available, and in this case, the reduction blocks. This different behavior does not give additional power to the OCaml adversary, because the adversary can test before performing the call whether it will succeed or not. The rules (FailedCall1) and (FailedCall2) implement this test. By rule (Simulator add thread), we modify the behavior of the $addthread$ construct to transform references to our generated modules $program (μ_{role})$ into references to the corresponding role ${program}^{'} (role [\tilde{a}])$ where $\tilde{a}$ are the replication indices we chose for this particular reference and $\begin{array}{l} {program}^{'} (role [\tilde{a}]) \overset{def}{=} let μ_{role} . init = let token = ref Callable in {tagfunction}^{role} {pm}_{role [\tilde{a}]}^{'} \\ \text{where } {pm}_{role [\tilde{a}]}^{'} \overset{def}{=} () \to \\ \quad if (! token = Callable) \\ \quad then (token : = Invalid; (call (O_{1} [\tilde{a_{1}}]), \dots, call (O_{k} [\tilde{a_{k}}]))) \\ \quad else raise Bad\_Call \end{array}$ where $oraclelist (Q (role) [\tilde{a}]) = [O_{1} [\tilde{a_{1}}], \dots, O_{k} [\tilde{a_{k}}]]$ , and the $\tilde{a_{j}}$ are either $\tilde{a}$ or $\_, \tilde{a}$ . In particular, the initialization function defined in ${program}^{'} (role [\tilde{a}])$ returns oracles represented by $call$ values instead of closures.

The CryptoVerif function ${simulate}_{ML} : T_{C S} \to bitstring$ follows the simulator semantics defined in Fig. 19: formally, we define ${simulate}_{ML} (repr (C S)) \overset{def}{=} simreturn ({C S}^{'})$ where ${C S}^{'}$ is the configuration such that either $C S$ reduces into ${C S}^{'}$ in at most $N_{steps}$ reductions and ${C S}^{'}$ does not reduce, or $C S$ reduces into ${C S}^{'}$ in exactly $N_{steps}$ reductions, by the semantics of Fig. 19, and $simreturn ({C S}^{'})$ is defined below. (We need to bound the number of reductions to make sure that ${simulate}_{ML}$ is always defined. The proof of the simulation between OCaml and CryptoVerif, presented in the next section, shows that the simulator configuration always blocks after at most $N_{steps}$ reductions, so that we are always in the first case.)

If $C_{pe} ({C S}^{'}) = call (O [\tilde{a}]) (v_{1}, \dots, v_{l})$ , let $T_{1}, \dots, T_{l}$ be the types of the arguments of the oracle O and let o be the constant associated to O. We define $simreturn ({C S}^{'}) \overset{def}{=} (repr ({C S}^{'}), o, \tilde{a}, (G_{val T_{1}}^{- 1} (v_{1}), \dots, G_{val T_{l}}^{- 1} (v_{l}))) .$

If $C_{pe} ({C S}^{'}) = call (O [\_, \tilde{a}]) (v_{1}, \dots, v_{l})$ , let $T_{1}, \dots, T_{l}$ be the types of the arguments of the oracle O, let o be the constant associated to O, and let $a^{'}$ be the value such that $O [[a^{'}, + \infty [, \tilde{a}]$ is in the set $I$ where ${C S}^{'} = C, R I, I$ . We define $simreturn ({C S}^{'}) \overset{def}{=} (repr ({C S}^{'}), o, (a^{'}, \tilde{a}), (G_{val T_{1}}^{- 1} (v_{1}), \dots, G_{val T_{l}}^{- 1} (v_{l}))) .$

If $C_{pe} ({C S}^{'}) = random ()$ , we define $simreturn ({C S}^{'}) \overset{def}{=} (repr ({C S}^{'}), o_{R}, (), ()) .$

Otherwise, we define $simreturn ({C S}^{'}) \overset{def}{=} (repr ({C S}^{'}), o_{S}, (), ()) .$
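The four cases of $simreturn$ can be sketched as follows in Python. This is an illustration under stated assumptions, not the paper's definition: a simulator configuration is reduced to a triple `(pe, RI, I)` holding only the current expression and the two sets, the expression is a hypothetical tuple `("call", name, indices, args)` or `("random",)`, the set $I$ is a dictionary mapping (oracle name, outer indices) to the smallest unused index, the oracle name stands for its constant o, and both $repr$ and the value translation $G_{val T}^{- 1}$ are taken to be the identity.

```python
def simreturn(cs):
    """Return (state, tag, indices, args) for a blocked configuration cs."""
    pe, RI, I = cs
    if pe[0] == "call":
        _, name, idx, args = pe
        if idx and idx[0] == "_":
            # Oracle directly under replication: use the smallest unused
            # index a' recorded in I for these outer indices.
            idx = (I[(name, idx[1:])],) + tuple(idx[1:])
        return (cs, name, tuple(idx), tuple(args))
    if pe[0] == "random":
        return (cs, "o_R", (), ())
    # Any other blocked configuration is treated as termination.
    return (cs, "o_S", (), ())
```

The replicated case is the only one that consults $I$ : the placeholder index is resolved at the moment the call is handed back to CryptoVerif.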

The function ${simulate}_{ML}$ can be implemented by a deterministic Turing machine (since the random choices are handled outside ${simulate}_{ML}$ ), so it can be used as a CryptoVerif function.

When ${simulate}_{ML}$ returns $(repr ({C S}^{'}), o, \tilde{a}, (a_{1}, \dots, a_{l}))$ , the CryptoVerif process $Q_{adv} (Q_{0}, {program}_{0})$ performs the corresponding oracle call $O [\tilde{a}] (a_{1}, \dots, a_{l})$ (lines 10–17 of Fig. 18). Similarly, when ${simulate}_{ML}$ returns $(repr ({C S}^{'}), o_{R}, (), ())$ , the process $Q_{adv} (Q_{0}, {program}_{0})$ performs a random choice (lines 18–20), and when ${simulate}_{ML}$ returns $(repr ({C S}^{'}), o_{S}, (), ())$ , the process $Q_{adv} (Q_{0}, {program}_{0})$ terminates (lines 8–9; the corresponding OCaml program also terminates).

The functions ${simulate}_{ret O}$ and ${simulate}_{end O}$ replace, in the simulator configuration, the $call$ expression with the result returned by the oracle and with an expression that raises the $Match\_failure$ exception, respectively. The function ${simulate}_{ret O}$ handles the situation in which an oracle returns a result by $return$ ; the function ${simulate}_{end O}$ handles the situation in which the oracle terminates with $end$ . Formally, these functions are defined as follows.

Definition 8.13 (Simulation of oracle return).

Let us consider a simulator configuration $C S = C, R I, I$ , with $C_{pe} (C S) = call (O [\tilde{a}]) (v_{1}, \dots, v_{l}) \text{ or } call (O [\_, {\tilde{a}}^{'}]) (v_{1}, \dots, v_{l}) .$ When $C_{pe} (C S)$ is of the second form, we denote by $\tilde{a}$ the indices $a^{″}$ , ${\tilde{a}}^{'}$ where $a^{″}$ is such that $O [[a^{″}, + \infty [, {\tilde{a}}^{'}] \in I$ . Let $I^{'} \overset{def}{=} I - (O [\tilde{a}])$ .

We define the CryptoVerif function ${simulate}_{ret O} : T_{C S} \times bitstring \to T_{C S}$ as follows.

If the returns in oracle O end the current role, then by Property 4.11, there is only one $return$ statement in O; let Q be the oracle definition following this statement, and let $\begin{array}{rcl} {R I}^{'} & \overset{def}{=} & \{role [\tilde{a}] ∣ (μ_{role}, Once) \in G_{getMI} (Q)\} \\ & & \cup \{role [[1, + \infty [, \tilde{a}] ∣ (μ_{role}, Any) \in G_{getMI} (Q)\} . \end{array}$ Let $T_{1}, \dots, T_{n}$ be the types of the return value of O. We define: ${simulate}_{ret O} (repr (C, R I, I), (r_{1}, \dots, r_{n})) \overset{def}{=} repr (C^{'}, R I \cup {R I}^{'}, I^{'}),$ where $C^{'}$ is the configuration $C$ in which the current expression is replaced with the translated result: $(G_{val T_{1}} (r_{1}), \dots, G_{val T_{n}} (r_{n}))$ .

If the returns in oracle O do not end the current role, then let us define $\mathcal{O} \overset{def}{=} returnoracles (O [\tilde{a}])$ . Let $I^{″}$ be the set $I^{'}$ to which we added the oracles present in $\mathcal{O}$ : $I^{″} \overset{def}{=} I^{'} \cup \{O^{'} [[1, + \infty [, \tilde{a}] ∣ O^{'} [\_, \tilde{a}] \in \mathcal{O}\} \cup \{O^{'} [\tilde{a}] ∣ O^{'} [\tilde{a}] \in \mathcal{O}\} .$ We define: ${simulate}_{ret O} (repr (C, R I, I), (r_{1}, \dots, r_{n})) \overset{def}{=} repr (C^{'}, R I, I^{″}),$ where $C^{'}$ is the configuration $C$ in which the current expression is replaced with the translated result: $(call (O_{1} [\tilde{a_{1}}]), \dots, call (O_{l} [\tilde{a_{l}}]), G_{val T_{1}} (r_{1}), \dots, G_{val T_{n}} (r_{n}))$ , with $\mathcal{O} = \{O_{1} [\tilde{a_{1}}], \dots, O_{l} [\tilde{a_{l}}]\}$ and the $\tilde{a_{j}}$ are either $\tilde{a}$ or $\_, \tilde{a}$ .

In all other cases (that is, $C S$ is not of the form mentioned above or a is not a tuple of n bitstrings of types $T_{1}, \dots, T_{n}$ ), ${simulate}_{ret O} (repr (C S), a)$ can take any value, since these cases are in fact not used.

Finally, we define the CryptoVerif function ${simulate}_{end O} : T_{C S} \to T_{C S}$ by: ${simulate}_{end O} (repr (C, R I, I)) \overset{def}{=} repr (C^{″}, R I, I^{'}),$ where $C^{″}$ is the configuration $C$ in which the current expression is replaced with $raise Match\_failure$ . In all other cases (that is, $C S$ is not of the form mentioned above), ${simulate}_{end O} (repr (C S))$ can take any value, since these cases are in fact not used.

When the returns in oracle O end the current role, the function ${simulate}_{ret O}$ does not return the oracles following the current oracle, but adds the corresponding roles to the role set $R I$ . The programs that contain these roles can then be launched by $addthread$ .

Definition 8.14 (Random simulation).

We define the CryptoVerif function ${simulate}_{R} : T_{C S} \times bool \to T_{C S}$ by ${simulate}_{R} (repr (C, R I, I), b) \overset{def}{=} repr (C^{'} (b), R I, I),$ where $C^{'} (b)$ is the configuration $C$ in which the current expression is replaced with the OCaml boolean value $G_{val bool} (b)$ .

Let us finally define the initial state of the simulator. Let ${R I}_{0}$ be the set of initially callable roles of $Q_{0}$ with their replication indices: ${R I}_{0} \overset{def}{=} \{role [] ∣ (μ_{role}, Once) \in G_{getMI} (Q_{0})\} \cup \{role [[1, + \infty [] ∣ (μ_{role}, Any) \in G_{getMI} (Q_{0})\}$ . We define: $s_{0} (Q_{0}, {program}_{0}) \overset{def}{=} repr (([⟨ \emptyset, {program}_{0}, [], \emptyset ⟩], {globalstore}_{0}, 1), {R I}_{0}, \emptyset) .$

Let us introduce notations for subprocesses of Fig. 18, used in the next example and in Definition 8.32.

Definition 8.15 (Processes).

We use the following notations: $P_{loop}$ is the process from line 7 to line 20 in Fig. 18, and $\begin{array}{rcl} Q_{loop} & \overset{def}{=} & O_{loop} [i^{'}] (s : T_{C S}) : = P_{loop} . \\ P_{return - loop} (α) & \overset{def}{=} & if b_{α, r} [] then \\ & & \quad let r [] : T_{C S} = loop O_{loop} [α + 1] (r_{α, r}^{'} []) in end else end \\ & & else r [] \leftarrow r_{α, r}^{'} []; end . \\ S_{loop} (α) & \overset{def}{=} & [((r_{α, r}^{'} [], b_{α, r} []), P_{return - loop} (α), end), (x [], return (x []), end)] . \end{array}$

These notations are useful to represent the CryptoVerif configuration when CryptoVerif calls ${simulate}_{ML}$ , at line 7 of Fig. 18: in iteration α of oracle $O_{loop}$ , the current process is $P_{loop} \{α / i^{'}\}$ , the available $O_{loop}$ oracles are $Q_{loop} \{a / i^{'}\}$ for $α < a ⩽ N_{rand + calls}$ , and the CryptoVerif stack is $S_{loop} (α)$ .

Example 8.16.

Let us consider again the OCaml program ${program}_{0}$ of Example 7.3 and the process $Q_{0}$ of Example 4.10. The initial state of the simulator is then $s_{0} (Q_{0}, {program}_{0}) = repr ({C S}_{0})$ with ${C S}_{0} \overset{def}{=} ([⟨ \emptyset, {program}_{0}, [], \emptyset ⟩], {globalstore}_{0}, 1), {R I}_{0}, I_{0}$ where ${R I}_{0} \overset{def}{=} \{keygen []\}$ , $I_{0} \overset{def}{=} \emptyset$ , and ${globalstore}_{0}$ is defined in Example 7.3.

We execute the simulator of Fig. 18 with that value of $s_{0} (Q_{0}, {program}_{0})$ ; in this example, the oracles $O_{1}, O_{2}, \dots$ are $Okeygen$ , $OA$ and $OB$ . CryptoVerif calls oracle $O_{start}$ , which iterates $O_{loop}$ . It first calls $O_{loop} [1] (s_{0} (Q_{0}, {program}_{0}))$ , which calls ${simulate}_{ML} (s_{0} (Q_{0}, {program}_{0}))$ (Fig. 18, line 7). This function starts running the simulator semantics. In the execution of the $addthread$ expression, ${role}_{1} = keygen$ , $\tilde{a}$ is empty, ${R I}^{″} = \{keygen []\}$ , so $keygen []$ is removed from $R I$ , which becomes ${R I}_{1} = \emptyset$ , and the corresponding oracle $Okeygen []$ is added to $I$ , which becomes $I_{1} = \{Okeygen []\}$ : this oracle can now be called. In the added thread, $program (μ_{keygen})$ is replaced with $\begin{array}{rcl} {program}^{'} (keygen []) & \overset{def}{=} & let μ_{keygen} . init = let token = ref Callable in {tagfunction}^{keygen} () \to \\ & & \quad if (! token = Callable) \\ & & \quad then (token : = Invalid; call (Okeygen [])) \\ & & \quad else raise Bad\_Call \end{array}$ After the evaluation of $μ_{keygen} . init ()$ , the simulator configuration is ${C S}_{1} \overset{def}{=} ([{th}_{1}^{s}, {th}_{2}^{s}], {globalstore}_{1}^{s}, 2), {R I}_{1}, I_{1}$ where $\begin{array}{rcl} {th}_{2}^{s} & \overset{def}{=} & ⟨ {env}_{2}^{s}, {pe}_{2}^{s}, {stack}_{2}^{s}, {store}_{2}^{s} ⟩, \\ {env}_{2}^{s} & \overset{def}{=} & {env}_{prim} \oplus \{μ_{keygen} . init \mapsto {tagfunction}^{keygen, τ_{1}} [{env}_{prim} \cup \{token \mapsto l_{1}\}, {pm}_{keygen []}^{'}]\}, \\ {pe}_{2}^{s} & \overset{def}{=} & call (Okeygen []) (), \\ {stack}_{2}^{s} & \overset{def}{=} & [({env}_{2}^{s}, pkg : = [\cdot]); ({env}_{2}^{s}, [\cdot]; schedule (1)); ({env}_{2}^{s}, let \_ = [\cdot];;)], \\ {store}_{2}^{s} & \overset{def}{=} & \{l_{1} \mapsto Invalid\} . \end{array}$ The execution of this thread in the simulator is fairly similar to the one in OCaml, discussed in Example 7.3. It first initializes the module $μ_{prim}$ , which creates the environment ${env}_{prim}$ . Next, it initializes the module $μ_{keygen}$ : it creates the store location $l_{1}$ for the token of $μ_{keygen} . init$ and defines $μ_{keygen} . init$ , which leads to the environment ${env}_{2}^{s}$ . Then it goes into evaluation contexts to evaluate $μ_{keygen} . init () ()$ , which leads to the stack ${stack}_{2}^{s}$ . The evaluation of $μ_{keygen} . init ()$ sets the token of $μ_{keygen} . init$ , in location $l_{1}$ , to $Invalid$ and replaces $μ_{keygen} . init ()$ with the corresponding $call$ value, which leads to the current expression ${pe}_{2}^{s}$ . In contrast to the OCaml execution, no token is created for oracle $Okeygen$ , so location $l_{2}$ does not appear. The global store remains unchanged: ${globalstore}_{1}^{s} = {globalstore}_{0}$ .

At this configuration, the function ${simulate}_{ML}$ stops and returns to CryptoVerif to evaluate the call to oracle $Okeygen []$ . The CryptoVerif configuration at this point is $C_{1} = E_{1}, P_{loop} \{1 / i^{'}\}, T_{1}, Q_{1}, S_{loop} (1), E_{1}$ where $T_{1} = \emptyset$ since this example does not use tables; $Q_{1} = \{Q_{loop} \{a / i^{'}\} ∣ 1 < a ⩽ N_{rand + calls}\} \cup \{Okeygen [] () : = \dots \text{ (as in } Q_{0} \text{)}\}$ since $O_{loop}$ has been called with index 1 and is still available for larger indices, $Okeygen$ has not been called yet, so it is available, $OA$ and $OB$ will become available only after the return from $Okeygen$ ; $E_{1} = []$ since no events have been executed so far.

8.4. Correspondence between the CryptoVerif and OCaml systems

In this section, we prove our main security theorem by relating the CryptoVerif and OCaml systems.

Similarly to the definition of $Pr [C :^{(CV)} D]$ in Section 4.2, we define the probability of breaking the security property associated to D in OCaml: $Pr [C :^{(ML)} D]$ is the probability of the set of complete OCaml traces starting at $C$ and such that the list of events $events$ in their last configuration satisfies $D (G_{ev}^{- 1} (events)) = true$ . Our goal is to prove that, for all protocols $Q_{0}$ , OCaml adversaries ${program}_{0}$ , and distinguishers D, we have $Pr [C_{0} (Q_{0}, {program}_{0}) :^{(CV)} D] = Pr [C_{0} (Q_{0}, {program}_{0}) :^{(ML)} D] .$ As explained in Section 2, this result shows the correctness of our compiler.

To this end, we first introduce an intermediate semantics for CryptoVerif that decomposes the evaluation of the function ${simulate}_{ML}$ into several small steps. We easily relate this semantics to the semantics of CryptoVerif. Next, in Section 8.4.2, we relate the intermediate semantics to the OCaml semantics. For this purpose, we introduce a relation between configurations of the intermediate semantics and OCaml traces which, in particular, ensures that the events are the same on both sides, and we prove that this relation is preserved by reduction. Finally, in Section 8.4.3, we use these results to prove our main theorem.

8.4.1. Intermediate semantics

We introduce extended CryptoVerif configurations $C^{cs}$ , which are configurations of the form $C$ or $C$ , $steps$ , $C S$ , where $C S$ is a simulator configuration and $steps$ is the maximum number of reductions of $C S$ that can still be performed. (We use the field $steps$ to guarantee termination.) The configurations $C$ , $steps$ , $C S$ serve to represent the state of the system during the evaluation of the function ${simulate}_{ML}$ . We define a reduction relation ⇝ on the extended configurations $C^{cs}$ .

Definition 8.17.
Let us define the reduction relation ⇝ such that: $\begin{array}{ll} \frac{\begin{matrix} E, P, T, Q, S, E \to_{p} C^{'} \\ P \text{ is not of the form } x [a] \leftarrow {simulate}_{ML} (s [a]); P^{'} \text{ for any } x, a, P^{'} \end{matrix}}{E, P, T, Q, S, E ⇝_{p} C^{'}} & \text{(CryptoVerif)} \\ \frac{E (s [a]) = repr (C S)}{\begin{matrix} E, x [a] \leftarrow {simulate}_{ML} (s [a]); P^{'}, T, Q, S, E \\ ⇝ E, x [a] \leftarrow {simulate}_{ML} (s [a]); P^{'}, T, Q, S, E, N_{steps}, C S \end{matrix}} & \text{(Enter Simulator)} \\ \frac{C S \to {C S}^{'} \qquad steps > 0}{E, P, T, Q, S, E, steps, C S ⇝ E, P, T, Q, S, E, steps - 1, {C S}^{'}} & \text{(Simulator)} \\ \frac{C S \text{ does not reduce or } steps = 0}{\begin{matrix} E, x [a] \leftarrow {simulate}_{ML} (s [a]); P^{'}, T, Q, S, E, steps, C S \\ ⇝ E [x [a] \mapsto simreturn (C S)], P^{'}, T, Q, S, E \end{matrix}} & \text{(Leave Simulator)} \end{array}$

When encountering a configuration $C = E, P, T, Q, S, E$ such that P is of the form $x [a] \leftarrow {simulate}_{ML} (s [a]); P^{'}$ and $E (s [a]) = repr (C S)$ , we reduce $C$ into an extended configuration $C$ , $N_{steps}$ , $C S$ by (Enter Simulator). We reduce $C S$ by (Simulator) until it blocks or the number of allowed reductions $N_{steps}$ is exhausted, and then we resume the CryptoVerif reductions by (Leave Simulator).
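
The interplay of the three simulator rules can be pictured as a step-bounded loop. The following OCaml fragment is only an illustrative sketch, not the formal semantics: the names `run_simulator` and `toy_step` are hypothetical, simulator configurations are abstracted to an arbitrary type, and probabilities are ignored.

```ocaml
(* Hypothetical sketch of (Enter Simulator), (Simulator), (Leave Simulator).
   [step] is an assumed one-step reduction of a simulator configuration;
   it returns [None] when the configuration does not reduce. *)
let rec run_simulator (step : 'cs -> 'cs option) (steps : int) (cs : 'cs) : 'cs =
  if steps <= 0 then cs                (* budget exhausted: (Leave Simulator) *)
  else
    match step cs with
    | None -> cs                       (* blocked: (Leave Simulator) *)
    | Some cs' -> run_simulator step (steps - 1) cs'   (* (Simulator) *)

(* Toy simulator configuration: a counter that blocks once it reaches 3. *)
let toy_step (n : int) : int option = if n < 3 then Some (n + 1) else None

let () =
  (* The configuration blocks before the budget is used up. *)
  assert (run_simulator toy_step 10 0 = 3);
  (* A budget of 2 reductions is exhausted first. *)
  assert (run_simulator toy_step 2 0 = 2)
```

In this picture, the initial budget passed to `run_simulator` plays the role of $N_{steps}$ , and the value returned corresponds to the simulator configuration handed to $simreturn$ .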

In the next lemma and proposition, we relate traces using ⇝ to traces using →, to prove that all events have the same probability in these two semantics. These results are proved in Appendix E (see the Supplemental material).
Lemma 8.18.
Let $C$ be a CryptoVerif configuration.
If $C \to_{p} C^{'}$ , then there is a trace $C ⇝_{p}^{*} C^{'}$ and all intermediate configurations in this trace (if any) are of the form $C$ , $steps$ , $C S$ .

If $C$ does not reduce by →, then it does not reduce by ⇝ either.

We denote by $Pr [C^{cs} :^{(⇝)} D]$ the probability of the set of complete CryptoVerif traces using ⇝ starting at $C^{cs}$ and such that the list of events $E$ in their last configuration satisfies $D (E) = true$ . The next proposition shows that all events have the same probability in the intermediate semantics as in the CryptoVerif semantics.
Proposition 8.19.
$Pr [C :^{(⇝)} D] = Pr [C :^{(CV)} D]$ .
Proof sketch.
We partition the set of complete traces using → and beginning at $C$ into two: the traces ${C T S}_{true}$ that verify D and the traces ${C T S}_{false}$ that do not verify D. By using Lemma 8.18, we convert these sets into two sets of traces using ⇝, ${C T S}_{true}^{cs}$ and ${C T S}_{false}^{cs}$ . Traces in ${C T S}_{true}^{cs}$ verify D, and traces in ${C T S}_{false}^{cs}$ do not verify D, and $Pr [{C T S}_{b}] = Pr [{C T S}_{b}^{cs}]$ for $b \in {true, false}$ . These two sets form a partition of the set of complete traces using ⇝. □

8.4.2. Relation between the intermediate semantics and the OCaml semantics

In this section, we first define a relation between the intermediate semantics and the OCaml semantics. Then, we prove that this relation holds, which implies that $Pr [C_{0} (Q_{0}, {program}_{0}) :^{(⇝)} D] = Pr [C_{0} (Q_{0}, {program}_{0}) :^{(ML)} D]$ .

Since the definition of the relation is fairly complex, we proceed in several steps. We first define an invariant on the simulator configurations, which intuitively means that each oracle is in a single status (possibly available in the future, available for immediate calls, already called) and that oracles available in different threads are distinct. To formalize this invariant, we first define the sets of oracles represented by $I$ and $R I$ .

Definition 8.20 (Concretization of $I$ and $R I$ ).

Let us define the sets of oracles $O^{\infty} (I)$ and $O^{\infty} (R I)$ represented by $I$ and $R I$ , respectively: $\begin{array}{rcl} O^{\infty} (I) \overset{def}{=} {O [b, {\tilde{a}}^{'}] ∣ O [[a, + \infty [, {\tilde{a}}^{'}] \in I, a ⩽ b} \cup {O [\tilde{a}] ∣ O [\tilde{a}] \in I} \\ O^{\infty} (R I) \overset{def}{=} {O [b, \tilde{a}] ∣ role [\tilde{a}] \in R I, O [_, \tilde{a}] \in oraclelist (Q (role) [\tilde{a}]), 1 ⩽ b} \\ \cup {O [\tilde{a}] ∣ role [\tilde{a}] \in R I, O [\tilde{a}] \in oraclelist (Q (role) [\tilde{a}])} \\ \cup {O [b, {\tilde{a}}^{'}] ∣ role [[a, + \infty [, {\tilde{a}}^{'}] \in R I, O [b, {\tilde{a}}^{'}] \in oraclelist (Q (role) [b, {\tilde{a}}^{'}]), a ⩽ b} \end{array}$

The definition of $O^{\infty} (I)$ and $O^{\infty} (R I)$ ignores the replication bounds and allows the indices of oracles to go to infinity. Using unbounded indices is helpful in Definition 8.22. By Assumption 4.14, when O is a first oracle of a role $role$ under replication, O cannot be under replication in $Q (role)$ . So the last component of $O^{\infty} (R I)$ cannot contain oracles under replication.

Next, we define several sets of oracles and roles, which allow us to determine which oracles and roles are in which state (callable immediately, available later) in a simulator configuration.

Definition 8.21 (Oracle sets).

Let $O_{call} (th)$ be the set of oracles $O [\tilde{a}]$ not under replication that occur in $call$ constructs in the thread $th$ , without entering tagged functions and closures.

Let $O_{call - repl} (th)$ be the set of oracles $O [a, \tilde{a}]$ such that O is under replication, $a > N_{O}$ , and $call (O [_, \tilde{a}])$ occurs in the thread $th$ , without entering tagged functions and closures.

Let $R_{init - closure} (th)$ be the set of roles $role [\tilde{a}]$ such that there exists $env$ such that a closure ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}]$ is present in the thread $th$ , and such that $env (token)$ is bound in its store to $Callable$ .

Let $R_{init - function} (th)$ be the set of roles $role [\tilde{a}]$ such that the initialization function ${program}^{'} (role [\tilde{a}])$ is present in the thread $th$ .

Let $O_{call} (C S)$ , $R_{init - closure} (C S)$ , and $R_{init - function} (C S)$ be the unions of the corresponding sets for all threads of the configuration.

Let $C S = C, R I, I$ . Let $willbeavailable (C S)$ be the set of oracles that can eventually become available. This set is defined as follows. We denote by $callable (C S)$ the set of callable oracles: $callable (C S) \overset{def}{=} O^{\infty} (I) \cup O^{\infty} (R I) \cup O^{\infty} (R_{init - closure} (C S) \cup R_{init - function} (C S))$ . We let $oracleset (Q)$ (resp. $oracleset (P)$ ) be the set of oracles that may be defined by the process Q (resp. P), defined as follows: $\begin{array}{rcl} oracleset (0) \overset{def}{=} \emptyset \\ oracleset (Q_{1} ∣ Q_{2}) \overset{def}{=} oracleset (Q_{1}) \cup oracleset (Q_{2}) \\ oracleset (foreach i^{'} ⩽ n do Q) \overset{def}{=} ⋃_{b = 1}^{n} oracleset (Q {b / i^{'}}) \\ oracleset (role {Q) \overset{def}{=} oracleset (Q) \\ oracleset (O [\tilde{a}] (x_{1} [\tilde{a}], \dots, x_{k} [\tilde{a}]) : = P) \overset{def}{=} {O [\tilde{a}]} \cup oracleset (P) \\ oracleset (P) \overset{def}{=} oracleset (Q) where Q is an oracle definition located after a return in P, \\ or \emptyset if there is no return in P . \end{array}$ By Property 4.5, the result is independent of the chosen $return$ statement in the last formula.

We let ${returnoracles}^{'} (O [\tilde{a}]) \overset{def}{=} oracleset (P {\tilde{a} / \tilde{i}})$ where oracle O is defined by $O [\tilde{i}] (x_{1} [\tilde{i}], \dots, x_{k} [\tilde{i}]) : = P$ in $Q_{0}$ . Finally, we define $willbeavailable (C S) \overset{def}{=} ⋃_{O [\tilde{a}] \in callable (C S)} {returnoracles}^{'} (O [\tilde{a}])$ .
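
The recursive definition of $oracleset$ can be mirrored on a small abstract syntax tree. The OCaml fragment below is a sketch under simplifying assumptions that are not part of the formal development: the types `index` and `q`, the helper `subst`, and the concrete bounds 2 and 1 are hypothetical, and the body P of an oracle is abstracted to the oracle definitions located after one of its returns (`None` when P contains no return).

```ocaml
(* Index terms: a replication index variable or a concrete value. *)
type index = Var of string | Const of int

(* Simplified oracle definitions Q (hypothetical encoding). *)
type q =
  | Nil                                       (* 0 *)
  | Par of q * q                              (* Q1 | Q2 *)
  | Foreach of string * int * q               (* foreach i <= n do Q *)
  | Role of string * q                        (* role {Q *)
  | Oracle of string * index list * q option  (* O[indices](...) := P *)

(* Substitute the value b for the index variable i, written Q{b/i}. *)
let rec subst i b = function
  | Nil -> Nil
  | Par (q1, q2) -> Par (subst i b q1, subst i b q2)
  | Foreach (j, n, q) ->
      if j = i then Foreach (j, n, q) else Foreach (j, n, subst i b q)
  | Role (r, q) -> Role (r, subst i b q)
  | Oracle (o, idx, after) ->
      let idx' = List.map (function Var j when j = i -> Const b | x -> x) idx in
      Oracle (o, idx', Option.map (subst i b) after)

module OSet = Set.Make (struct
  type t = string * index list
  let compare = compare
end)

let rec oracleset = function
  | Nil -> OSet.empty
  | Par (q1, q2) -> OSet.union (oracleset q1) (oracleset q2)
  | Foreach (i, n, q) ->
      (* Union of oracleset (Q{b/i}) for b = 1..n. *)
      List.fold_left
        (fun acc b -> OSet.union acc (oracleset (subst i b q)))
        OSet.empty
        (List.init n (fun k -> k + 1))
  | Role (_, q) -> oracleset q
  | Oracle (o, idx, after) ->
      let rest = match after with None -> OSet.empty | Some q -> oracleset q in
      OSet.add (o, idx) rest

(* Toy process shaped like the running example, with assumed bounds 2 and 1. *)
let after_keygen =
  Par (Foreach ("i1", 2, Oracle ("OA", [Var "i1"], None)),
       Foreach ("i2", 1, Oracle ("OB", [Var "i2"], None)))

let q0 = Role ("keygen", Oracle ("Okeygen", [], Some after_keygen))

let () =
  let expected =
    OSet.of_list [("OA", [Const 1]); ("OA", [Const 2]); ("OB", [Const 1])] in
  (* Oracles made available by the return of Okeygen. *)
  assert (OSet.equal (oracleset after_keygen) expected);
  assert (OSet.mem ("Okeygen", []) (oracleset q0))
```

Here `oracleset after_keygen` plays the role of ${returnoracles}^{'} (Okeygen [])$ for this toy process.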

The definition of $O_{call - repl} (th)$ may be surprising, as it considers $O [a, \tilde{a}]$ with a greater than the replication bound $N_{O}$ . We have made this choice to guarantee that $O_{call - repl} (th)$ is always included in $O^{\infty} (I)$ : the indices up to $N_{O}$ may have been consumed by calls already made to the oracle, while the indices greater than $N_{O}$ always remain, because we make at most $N_{O}$ calls to this oracle by definition of $N_{O}$ . This property is exploited in Item (O2) of Definition 8.22.

The sets $R_{init - closure} (C S)$ and $R_{init - function} (C S)$ are sets of roles with their replication indices, which can be seen as a role set $R I$ . The set $O^{\infty} (R_{init - closure} (C S) \cup R_{init - function} (C S))$ is the set of the first oracles of roles present in $R_{init - closure} (C S)$ and $R_{init - function} (C S)$ .

Finally, we can define the desired invariant on simulator configurations.

Definition 8.22 (Oracles have distinct status).

Let $C S = ([{th}_{1}, \dots, {th}_{n}], globalstore, tj), R I, I$ be a simulator configuration. We say that the oracles of $C S$ have distinct status when:

(O1) The sets $O^{\infty} (I) \cup O_{call} (C S)$ , $O^{\infty} (R I)$ and $willbeavailable (C S)$ are pairwise disjoint.

(O2) The $4 n$ sets of oracles $O_{call} ({th}_{i})$ , $O_{call - repl} ({th}_{i})$ , $O^{\infty} (R_{init - function} ({th}_{i}))$ and $O^{\infty} (R_{init - closure} ({th}_{i}))$ for $i ⩽ n$ are pairwise disjoint, and are all included in $O^{\infty} (I) \cup O_{call} (C S)$ .

To understand how all these oracle sets interact, let us present the flow of an oracle not under replication $O [\tilde{a}]$ in these sets.

Initially, if the oracle occurs at the beginning of the process, it is in $O^{\infty} (R I)$ ; otherwise, it is in $willbeavailable (C S)$ .

For an oracle occurring at the beginning of a role, when the role containing it is instantiated using $addthread$ , the oracle moves from $O^{\infty} (R I)$ to $O^{\infty} (R_{init - function} (th))$ . It is also added into $O^{\infty} (I)$ .

When the initialization function of the role is reduced into a closure, the oracle moves from $O^{\infty} (R_{init - function} (th))$ to $O^{\infty} (R_{init - closure} (th))$ .

When the initialization function of the role is called, the oracle moves from $O^{\infty} (R_{init - closure} (th))$ to $O_{call} (th)$ .

When the oracle itself is called, it is removed from $O^{\infty} (I)$ , and when the call to the oracle disappears from the thread, it is removed from $O_{call} (th)$ . The oracles made available after the call are removed from $willbeavailable (C S)$ and added either to $O^{\infty} (R I)$ if they start a role or to $O^{\infty} (I)$ and $O_{call} (th)$ if they do not start a role.

The case of an oracle under replication is fairly similar, using $O_{call - repl} (th)$ instead of $O_{call} (th)$ . Definition 8.22 ensures that an oracle cannot be simultaneously in two different sets. (We let indices go to infinity in $O^{\infty}$ to make sure that we cannot have simultaneously $O [[a^{'}, + \infty [, \tilde{a}] \in I$ with $a^{'} > N_{O}$ and $O [b, \tilde{a}] \in willbeavailable (C S)$ for all b. Indeed, if we bounded the indices to $N_{O}$ , no oracle would correspond to $O [[a^{'}, + \infty [, \tilde{a}]$ when $a^{'} > N_{O}$ , so this situation would not be prevented by Item (O1). It is prevented using $O^{\infty}$ .)

Example 8.23.
Let us show that the oracles of the initial simulator configuration ${C S}_{0}$ of Example 8.16 have distinct status. We have $O^{\infty} (I_{0}) = \emptyset$ , $O_{call} ({C S}_{0}) = \emptyset$ , $O^{\infty} ({R I}_{0}) = {Okeygen []}$ , and $willbeavailable ({C S}_{0}) = {OA [i] ∣ 1 ⩽ i ⩽ N_{1}} \cup {OB [i] ∣ 1 ⩽ i ⩽ N_{2}}$ : the oracle $Okeygen$ can be called immediately, just by starting the role $keygen$ ; the oracles $OA$ and $OB$ will be available later. Hence Item (O1) holds. All sets of Item (O2) are empty, so that item holds as well.

Let us also show that the oracles of the simulator configuration ${C S}_{1}$ of Example 8.16 have distinct status. We have $O^{\infty} (I_{1}) = {Okeygen []}$ , $O_{call} ({C S}_{1}) = {Okeygen []}$ since the only $call$ outside tagged functions and closures is $call (Okeygen [])$ in ${th}_{2}^{s}$ , $O^{\infty} ({R I}_{1}) = \emptyset$ , and $willbeavailable ({C S}_{1}) = willbeavailable ({C S}_{0})$ : a $call$ to oracle $Okeygen []$ occurs in the thread ${th}_{2}^{s}$ , and that call is allowed as shown by $I_{1}$ ; the oracles $OA$ and $OB$ will be available later. Hence Item (O1) holds. The only non-empty set in Item (O2) is $O_{call} ({th}_{2}^{s}) = {Okeygen []}$ , so that item holds as well. (There is a closure ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}]$ in ${th}_{2}^{s}$ , but its token is $Invalid$ because it has already been called.)
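
The first item of Definition 8.22 is a plain pairwise-disjointness check, which can be replayed on the two configurations of this example. The OCaml fragment below is a sketch only: the function `pairwise_disjoint`, the helper `range`, and the concrete bounds $N_{1} = N_{2} = 2$ are assumptions made for illustration.

```ocaml
module OSet = Set.Make (struct
  type t = string * int list      (* oracle name and concrete indices *)
  let compare = compare
end)

(* True when any two distinct sets of the list have an empty intersection. *)
let rec pairwise_disjoint = function
  | [] -> true
  | s :: rest ->
      List.for_all (fun s' -> OSet.is_empty (OSet.inter s s')) rest
      && pairwise_disjoint rest

(* The oracles name[1], ..., name[n]. *)
let range name n = OSet.of_list (List.init n (fun k -> (name, [k + 1])))

(* willbeavailable, with the assumed bounds N_1 = N_2 = 2. *)
let willbe = OSet.union (range "OA" 2) (range "OB" 2)
let keygen = OSet.singleton ("Okeygen", [])

(* CS_0: nothing in I or in call constructs, Okeygen[] in the role set. *)
let o1_cs0 = pairwise_disjoint [OSet.empty; keygen; willbe]

(* CS_1: Okeygen[] is in I and in a call construct, the role set is empty. *)
let o1_cs1 = pairwise_disjoint [OSet.union keygen keygen; OSet.empty; willbe]

(* A configuration listing Okeygen[] both as callable and as a pending role
   would violate the invariant. *)
let o1_bad = pairwise_disjoint [keygen; keygen; willbe]

let () =
  assert o1_cs0;
  assert o1_cs1;
  assert (not o1_bad)
```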

As a second step in our definition of the relation between the intermediate semantics and the OCaml semantics, we define an invariant of the intermediate semantics that shows how the sets $I$ and $R I$ of the simulator configuration represent the contents of the set of callable oracle definitions $Q$ .
Definition 8.24 (Relation between $I$ , $R I$ and $Q$ ).

Let us define the sets of oracles $O (I)$ and $O (R I)$ represented by $I$ and $R I$ , respectively: $\begin{array}{rcl} O (I) \overset{def}{=} {O [b, {\tilde{a}}^{'}] ∣ O [[a, + \infty [, {\tilde{a}}^{'}] \in I, a ⩽ b ⩽ N_{O}} \cup {O [\tilde{a}] ∣ O [\tilde{a}] \in I} \\ O (R I) \overset{def}{=} {O [b, \tilde{a}] ∣ role [\tilde{a}] \in R I, O [_, \tilde{a}] \in oraclelist (Q (role) [\tilde{a}]), 1 ⩽ b ⩽ N_{O}} \\ \cup {O [\tilde{a}] ∣ role [\tilde{a}] \in R I, O [\tilde{a}] \in oraclelist (Q (role) [\tilde{a}])} \\ \cup {O [b, {\tilde{a}}^{'}] ∣ role [[a, + \infty [, {\tilde{a}}^{'}] \in R I, \\ O [b, {\tilde{a}}^{'}] \in oraclelist (Q (role) [b, {\tilde{a}}^{'}]), a ⩽ b ⩽ N_{role}} \end{array}$ We write $Q \leftrightarrow R I, I$ when the following two properties hold:

$Q$ consists of exactly one element $O [\tilde{a}] (x_{1} [\tilde{a}] : T_{1}, \dots, x_{k} [\tilde{a}] : T_{k}) : = P$ for each $O [\tilde{a}]$ present in the set $O (I) \cup O (R I)$ . We denote by $Q (O [\tilde{a}])$ this element of $Q$ .

If $O [[a, + \infty [, {\tilde{a}}^{'}] \in I$ , then there exist a process Q and an index i such that i does not occur in $fv (Q)$ and for all $b \in {a, \dots, N_{O}}$ , we have $Q (O [b, {\tilde{a}}^{'}]) = Q {b / i}$ .

In contrast to the sets we defined in Definition 8.20, the indices of oracles in $Q$ are bounded by the replication bounds. So we redefine sets of oracles $O (R I)$ and $O (I)$ that correspond to $R I$ and $I$ , but with indices bounded by $N_{O}$ and $N_{role}$ as appropriate. The sets $O (R I)$ and $O (I)$ are included in $O^{\infty} (R I)$ and $O^{\infty} (I)$ , respectively. The set of processes $Q$ corresponds to $R I$ , $I$ when it contains exactly one definition for each oracle in $O (I) \cup O (R I)$ . Furthermore, in case an oracle is under replication, the corresponding elements of $Q$ all have the same form; they differ only by the value of the replication index. We enforce this property in the last item of Definition 8.24.
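
The difference between the unbounded sets of Definition 8.20 and the bounded sets of Definition 8.24 can be seen on a membership test for $I$ . The OCaml fragment below is an illustrative sketch: the type `entry`, the functions `mem`, `mem_inf`, `mem_bounded`, and the bound $N_{OA} = 2$ are hypothetical, and the replication index is taken to be the first index, as in $O [b, {\tilde{a}}^{'}]$ .

```ocaml
(* Elements of I: a single oracle O[a], or an interval entry standing for
   all replication indices from a given value upwards. *)
type entry =
  | Single of string * int list
  | From of string * int * int list

(* Membership of the oracle o[idx] in the set represented by i.
   [bound o] is [None] for the unbounded variant and [Some n_o] otherwise. *)
let mem ~(bound : string -> int option) (i : entry list)
    (o : string) (idx : int list) : bool =
  let below b = match bound o with None -> true | Some n -> b <= n in
  List.exists
    (function
      | Single (o', idx') -> o' = o && idx' = idx
      | From (o', a, rest) ->
          (match idx with
           | b :: rest' -> o' = o && rest' = rest && a <= b && below b
           | [] -> false))
    i

(* Unbounded variant: indices may go to infinity. *)
let mem_inf i o idx = mem ~bound:(fun _ -> None) i o idx

(* Bounded variant: the replication index is at most the bound of the oracle. *)
let mem_bounded n_o i o idx = mem ~bound:(fun o' -> Some (n_o o')) i o idx

(* Assumed replication bound: at most 2 calls to OA. *)
let n_o = function "OA" -> 2 | _ -> 0

let i_live = [From ("OA", 2, []); Single ("Okeygen", [])]
let i_exhausted = [From ("OA", 3, [])]

let () =
  assert (mem_inf i_live "OA" [5]);
  assert (not (mem_bounded n_o i_live "OA" [5]));
  assert (mem_bounded n_o i_live "OA" [2]);
  (* Once the counter exceeds the bound, only the unbounded set is non-empty. *)
  assert (mem_inf i_exhausted "OA" [3]);
  assert (not (mem_bounded n_o i_exhausted "OA" [3]))
```

The last two assertions illustrate why the invariant of Definition 8.22 uses the unbounded sets: an interval entry whose counter exceeds the bound still denotes oracles there, whereas it denotes none in the bounded sets.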

Example 8.25.
Let us consider the simulator configuration ${C S}_{1}$ and the CryptoVerif configuration $C_{1}$ of Example 8.16. Let $Q_{0} = {Okeygen [] () : = \dots (as in the process Q_{0} of Example 4.10)}$ . Since $O (I_{1}) = {Okeygen []}$ and $O ({R I}_{1}) = \emptyset$ , we have $Q_{0} \leftrightarrow {R I}_{1}, I_{1}$ and $Q_{1} = {Q_{loop} {a / i^{'}} ∣ 1 < a ⩽ N_{rand + calls}} \cup Q_{0}$ . Hence, the sets ${R I}_{1}$ and $I_{1}$ correctly represent the callable oracle definitions $Q_{0}$ that come from the protocol under consideration. The set of all callable oracle definitions $Q_{1}$ additionally contains oracle definitions $Q_{loop}$ that come from the simulator.

As a third step, we relate OCaml and simulator threads. To define this relation, we start from a simulator thread. We first replace the simulator role initialization with the OCaml one using the function $replaceinitpm$ (Definition 8.26). Next, we replace $call$ functional values with the corresponding closures (defined in Definition 8.27) using the function $replacecalls$ (Definition 8.28). Finally, we add the part of the store that contains the tokens (Definition 8.29) and possibly an unreachable part of the store created during calls to cryptographic primitives, and we obtain the corresponding OCaml thread. The full relation between OCaml and simulator threads is defined in Definition 8.30.
Definition 8.26 (Replace initialization).

The function $replaceinitpm$ replaces in its argument the pattern matchings corresponding to role initialization of the simulator with the OCaml module initialization: to be more precise, $replaceinitpm (th)$ replaces each occurrence of ${tagfunction}^{role} {pm}_{role [\tilde{a}]}^{'}$ in $th$ with ${tagfunction}^{role} {pm}_{μ_{role}}$ and each occurrence of ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}]$ in $th$ with ${tagfunction}^{role, τ} [env, {pm}_{μ_{role}}]$ .

This function transforms every occurrence of the tagged closures corresponding to role initialization in the simulator, which are added by the $addthread$ construct, into the corresponding tagged closures in OCaml.

Definition 8.27 (Correct closure).

Assume that $Q \leftrightarrow R I, I$ for some $R I$ , E is a CryptoVerif environment, $l_{tok}$ is a function that maps each oracle $O [\tilde{a}]$ to the location of its token, and $τ_{O}$ is a function that maps each oracle $O [_, \tilde{a}]$ to the tag τ of the corresponding closure. We define the set of closures that correspond to an oracle:

for an oracle $O [\tilde{a}] \in I$ : $\begin{array}{rcl} correctclosure (O [\tilde{a}], I, E, Q, l_{tok}, τ_{O}) \\ \overset{def}{=} {{tagfunction}^{O, τ} [env, {pm}_{Once} (Q (O [\tilde{a}]))] ∣ \\ env \supseteq {env}_{prim} \cup env (E, Q (O [\tilde{a}])), env (token) = l_{tok} (O [\tilde{a}])} \end{array}$

for an oracle $O [\tilde{a}] \notin I$ : $\begin{array}{rcl} correctclosure (O [\tilde{a}], I, E, Q, l_{tok}, τ_{O}) \\ \overset{def}{=} {{tagfunction}^{O, τ} [env, {pm}_{Once} (Q)] ∣ for any Q, env (token) = l_{tok} (O [\tilde{a}])} \end{array}$

for an oracle $O [[a^{'}, + \infty [, {\tilde{a}}^{″}] \in I$ with $a^{'} ⩽ N_{O}$ , $\begin{array}{rcl} correctclosure (O [_, {\tilde{a}}^{″}], I, E, Q, l_{tok}, τ_{O}) \\ \overset{def}{=} {{tagfunction}^{O, τ} [env, {pm}_{Any} (Q (O [a^{'}, {\tilde{a}}^{″}]))] ∣ \\ τ = τ_{O} (O [_, {\tilde{a}}^{″}]), env \supseteq {env}_{prim} \cup env (E, Q (O [a^{'}, {\tilde{a}}^{″}]))} \end{array}$

for an oracle $O [[a^{'}, + \infty [, {\tilde{a}}^{″}] \in I$ with $a^{'} > N_{O}$ , $\begin{array}{rcl} correctclosure (O [_, {\tilde{a}}^{″}], I, E, Q, l_{tok}, τ_{O}) \\ \overset{def}{=} {{tagfunction}^{O, τ} [env, {pm}_{Any} (Q)] ∣ τ = τ_{O} (O [_, {\tilde{a}}^{″}]), for any Q, env} \end{array}$

for an oracle $O [[a^{'}, + \infty [, {\tilde{a}}^{″}] \notin I$ : $correctclosure (O [_, {\tilde{a}}^{″}], I, E, Q, l_{tok}, τ_{O}) \overset{def}{=} \emptyset$

The function $correctclosure$ serves to map calls $call (R)$ in the simulator configuration into their corresponding closures in the OCaml configuration: $call (R)$ is mapped to an element of $correctclosure (R, I, E, Q, l_{tok}, τ_{O})$ by the function $replacecalls$ defined below.

In the case $O [\tilde{a}] \in I$ , we map $call (O [\tilde{a}])$ into the closure that translates the process $Q (O [\tilde{a}])$ .

The case $O [\tilde{a}] \notin I$ may be used when the oracle $O [\tilde{a}]$ has been called but the thread still contains a $call$ to this oracle. If the oracle is called again, the call will fail. The process $Q (O [\tilde{a}])$ is removed from $Q$ after execution, so we do not know which process to translate to obtain the correct closure for $O [\tilde{a}]$ ; this is why the correct closures for a call to an already called oracle can contain the translation of any process Q. This translation will fail and raise the exception $Bad_Call$ regardless of the translated process Q.

Oracles under replication cannot disappear from $I$ after having been added to it: when one calls the oracle $O [_, {\tilde{a}}^{″}]$ , we just increment the counter $a^{'}$ of the element $O [[a^{'}, + \infty [, {\tilde{a}}^{″}]$ present in $I$ . We need to distinguish whether or not the adversary has exhausted all the $N_{O}$ calls available for this oracle. If some calls remain available, the process $Q (O [a^{'}, {\tilde{a}}^{″}])$ is defined, and we require that $call (O [_, {\tilde{a}}^{″}])$ is mapped into a closure that translates this process. Otherwise, if all the calls are exhausted, $a^{'} > N_{O}$ , and $Q (O [a^{'}, {\tilde{a}}^{″}])$ is not defined, but we know that the adversary will not call the oracle again, so $call (O [_, {\tilde{a}}^{″}])$ can be mapped to closures that translate any process.

The case $O [[a^{'}, + \infty [, {\tilde{a}}^{″}] \notin I$ never happens: it would mean that the oracle $O [_, {\tilde{a}}^{″}]$ can be called but there is no reference to it in the set $I$ .
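
The five cases of Definition 8.27 amount to a case analysis on the contents of $I$ . The OCaml fragment below sketches only this case analysis, not the closures themselves: the types `entry`, `oracle_ref`, `closure_case`, the functions `counter_of` and `classify`, and the bound $N_{OA} = 2$ are hypothetical names and values chosen for illustration.

```ocaml
(* Elements of I, as in Definition 8.20 (hypothetical encoding). *)
type entry =
  | Single of string * int list
  | From of string * int * int list

(* A call refers to an oracle not under replication (Plain) or to an
   oracle under replication, without its replication index (Repl). *)
type oracle_ref = Plain of string * int list | Repl of string * int list

type closure_case =
  | Translate_once         (* in I: translate the process, pattern Once *)
  | Any_process_once       (* not in I: already called, any process *)
  | Translate_any of int   (* counter within the bound: translate, pattern Any *)
  | Any_process_any        (* counter above the bound: any process *)
  | No_closure             (* no interval entry in I: empty set *)

(* Counter of the interval entry for the oracle o with indices rest, if any. *)
let rec counter_of o rest = function
  | [] -> None
  | From (o', a, rest') :: _ when o' = o && rest' = rest -> Some a
  | _ :: tl -> counter_of o rest tl

let classify (bound : string -> int) (i : entry list) (r : oracle_ref) =
  match r with
  | Plain (o, idx) ->
      if List.mem (Single (o, idx)) i then Translate_once else Any_process_once
  | Repl (o, rest) ->
      (match counter_of o rest i with
       | None -> No_closure
       | Some a when a <= bound o -> Translate_any a
       | Some _ -> Any_process_any)

(* Assumed replication bound: at most 2 calls to OA. *)
let bound = function "OA" -> 2 | _ -> 0
let i = [Single ("Okeygen", []); From ("OA", 2, [])]

let () =
  assert (classify bound i (Plain ("Okeygen", [])) = Translate_once);
  assert (classify bound [] (Plain ("Okeygen", [])) = Any_process_once);
  assert (classify bound i (Repl ("OA", [])) = Translate_any 2);
  assert (classify bound [From ("OA", 3, [])] (Repl ("OA", [])) = Any_process_any);
  assert (classify bound [] (Repl ("OA", [])) = No_closure)
```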

Definition 8.28 (Replace $call$ ).

Let $I$ , E, $Q$ , $l_{tok}$ , $τ_{O}$ be as in Definition 8.27, $\begin{array}{rcl} replacecalls (⟨ env, pe, stack, store ⟩, I, E, Q, l_{tok}, τ_{O}) \\ \overset{def}{=} {⟨ {env}^{'}, σ (pe), σ (stack), σ (store) ⟩ ∣ if pe is a value v or an exceptional value raise v, then \\ {env}^{'} is any environment, else {env}^{'} = σ (env), where σ is a function that replaces, for each R, \\ each occurrence of call (R) with an element of correctclosure (R, I, E, Q, l_{tok}, τ_{O})} \end{array}$

The function $replacecalls$ replaces in its argument each call $call (O [\tilde{a}])$ with a closure that corresponds to the oracle $O [\tilde{a}]$ , computed by $correctclosure$ . It allows any environment when the current program or expression is a value or an exceptional value, because in these cases, the environment is not used.

Definition 8.29 (Token part of the store).

Let $I$ and $l_{tok}$ be as in Definition 8.27. Let $O$ be a set of oracles with indices of the form $O [\tilde{a}]$ . $\begin{array}{rcl} gettokens (I, O, l_{tok}) & \overset{def}{=} & {l_{tok} (O [\tilde{a}]) \mapsto Callable ∣ O [\tilde{a}] \in O \cap I} \\ \cup {l_{tok} (O [\tilde{a}]) \mapsto Invalid ∣ O [\tilde{a}] \in O ∖ I} \end{array}$

The function $gettokens$ returns the part of the store corresponding to the tokens of the closures of oracles not under replication.
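
As an illustration, $gettokens$ can be sketched with association lists. The OCaml fragment below is a hypothetical encoding, not the generated code: store locations are represented by integers, and the names `token`, `gettokens`, and `l_tok` simply mirror the notation of Definition 8.29.

```ocaml
type token = Callable | Invalid

(* [i] lists the oracles not under replication that belong to I, [os] is
   the set of oracles under consideration, and [l_tok] maps each oracle
   to the location of its token. *)
let gettokens (i : (string * int list) list)
    (os : (string * int list) list)
    (l_tok : string * int list -> int) : (int * token) list =
  List.map
    (fun o -> (l_tok o, if List.mem o i then Callable else Invalid))
    os

(* Assumed location function: the token of Okeygen[] is at location 2. *)
let l_tok = function ("Okeygen", []) -> 2 | _ -> 0

let () =
  (* Okeygen[] is in I: its token is Callable. *)
  assert (gettokens [("Okeygen", [])] [("Okeygen", [])] l_tok = [(2, Callable)]);
  (* After the call, Okeygen[] has left I: its token is Invalid. *)
  assert (gettokens [] [("Okeygen", [])] l_tok = [(2, Invalid)])
```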

In the following definitions, we use the exponent $s$ for the elements of the simulator configuration and the exponent $o$ for the elements of the OCaml configuration.

Definition 8.30 (Relation between simulator and OCaml threads).

Let ${th}^{s} = ⟨ {env}^{s}, {pe}^{s}, {stack}^{s}, {store}^{s} ⟩$ be a simulator thread, and ${th}^{o} = ⟨ {env}^{o}, {pe}^{o}, {stack}^{o}, {store}^{o} ⟩$ be an OCaml thread. Let $I$ , E, $Q$ , $τ_{O}$ be as in Definition 8.27. We say that ${th}^{o}$ matches ${th}^{s}$ knowing $I$ , E, $Q$ , $τ_{O}$ when one of the following two cases occurs:

(T1) ${th}^{o} = replaceinitpm ({th}^{s})$ and ${th}^{s} = ⟨ \emptyset, {program}_{prim};; {program}^{'} ({role}_{1} [\tilde{a_{1}}]);; \dots ;; {program}^{'} ({role}_{l} [\tilde{a_{l}}]);; {program}^{'}, [], \emptyset ⟩$ .

There is no closure, no tagged function ${tagfunction}^{t} pm$ , no $event$ , and no $return$ in ${program}^{'}$ , except in $program (μ_{role})$ in arguments of $addthread$ .

(T2) The following properties hold:

(a) There exist ${store}^{'}$ and an injective function $l_{tok}$ that associates to each $O [\tilde{a}]$ in $O_{call} ({th}^{s})$ a store location that does not occur in ${th}^{s}$ such that $\begin{array}{rcl} ⟨ {env}^{o}, {pe}^{o}, {stack}^{o}, {store}^{'} ⟩ \in replacecalls (replaceinitpm ({th}^{s}), I, E, Q_{0}, l_{tok}, τ_{O}), \\ {store}^{'} \cup gettokens (I, O_{call} ({th}^{s}), l_{tok}) \subseteq {store}^{o} . \end{array}$

(b) There exists an injective function $l_{init - tok}$ that associates to each role $role [\tilde{a}]$ such that a closure ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}]$ occurs in the thread ${th}^{s}$ for some $env$ and τ, a store location such that for all closures ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}]$ present in ${th}^{s}$ , we have $l_{init - tok} (role [\tilde{a}]) = env (token)$ .

The locations $l_{init - tok} (role [\tilde{a}])$ and $l_{tok} (O [{\tilde{a}}^{'}])$ are distinct for every role $role [\tilde{a}]$ and oracle $O [{\tilde{a}}^{'}]$ .

The locations $l_{init - tok} (role [\tilde{a}])$ occur only in $Dom ({store}^{s})$ and in $env (token)$ where $env$ is the environment of a tagged closure ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}]$ in ${th}^{s}$ .

(c) For each tagged closure ${tagfunction}^{t, τ} [env, pm]$ present in ${th}^{s}$ , the tag t is a role $role$ , ${env}_{prim} \subseteq env$ , and there exist indices $\tilde{a}$ such that $pm = {pm}_{role [\tilde{a}]}^{'}$ .

(d) There is no tagged function ${tagfunction}^{t} pm$ , no $event$ , and no $return$ in ${th}^{s}$ except in $program (μ_{role})$ in arguments of $addthread$ .

This definition relates the threads of the simulator and of OCaml. A thread can be in one of the following two states. If it satisfies Item (T1), the thread is a protocol thread that was not scheduled yet. The simulator and OCaml threads correspond by transforming the program ${program}^{'} (role [\tilde{a}])$ present in the simulator into the program of the module corresponding to the role, $program (μ_{role})$ . Otherwise, the thread satisfies Item (T2). In this case, Item (T2)(a) relates the contents of the simulator thread and the OCaml thread by replacing ${program}^{'} (role [\tilde{a}])$ with $program (μ_{role})$ as above, and by replacing calls to oracles using $call$ with a corresponding tagged closure. The tokens that determine whether oracles can be called are absent from the simulator: the value of these tokens is determined from $I$ by the function $gettokens$ , and we require that they are present in the OCaml store with their correct value. Item (T2)(b) ensures that all instances of a closure of a given role initialization $role [\tilde{a}]$ share the same store location for their tokens. This ensures that a role initialization closure is not called twice. Item (T2)(b) also ensures that all locations used for the tokens of role initialization are not accessible elsewhere. Item (T2)(c) ensures that every tagged closure present in the simulator is a correct closure for the initialization of a role. Item (T2)(d) is an invariant of the simulator that ensures that the adversary does not have access to our OCaml instrumentation features.

Example 8.31.
We use the notations $I_{1}$ , $E_{1}$ of Example 8.16, $Q_{0}$ of Example 8.25, and let $τ_{O}$ be any function. We verify that the thread ${th}_{2}$ of Example 7.3 matches the thread ${th}_{2}^{s}$ of Example 8.16 knowing $I_{1}$ , $E_{1}$ , $Q_{0}$ , $τ_{O}$ , because they satisfy Property (T2). The function $replaceinitpm$ replaces ${env}_{2}^{s} (μ_{keygen} . init)$ with ${env}_{2} (μ_{keygen} . init)$ . Let $l_{tok} = {Okeygen [] \mapsto l_{2}}$ . We have $I_{1} = {Okeygen []}$ , so $\begin{array}{rcl} correctclosure (Okeygen [], I_{1}, E_{1}, Q_{0}, l_{tok}, τ_{O}) \\ = {{tagfunction}^{Okeygen, τ} [env, {pm}_{Once} (Q_{0} (Okeygen []))] ∣ \\ env \supseteq {env}_{prim} \cup env (E_{1}, Q_{0} (Okeygen [])), env (token) = l_{2}} . \end{array}$ The process $Q_{0} (Okeygen [])$ is the definition of $Okeygen []$ in Example 4.10. It has no free variables, so $env (E_{1}, Q_{0} (Okeygen [])) = \emptyset$ . Therefore, we have ${tagfunction}^{Okeygen, τ_{2}} [{env}_{2} \oplus {token \mapsto l_{2}}, () \to (lines 5–13 of Example 7.1)] \in correctclosure (Okeygen [], I_{1}, E_{1}, Q_{0}, l_{tok}, τ_{O})$ , that is, the OCaml closure that corresponds to $Okeygen []$ is correct, so $replacecalls$ replaces ${pe}_{2}^{s}$ with ${pe}_{2}$ , hence $⟨ {env}_{2}, {pe}_{2}, {stack}_{2}, {store}_{2}^{s} ⟩ \in replacecalls (replaceinitpm ({th}_{2}^{s}), I_{1}, E_{1}, Q_{0}, l_{tok}, τ_{O})$ . Moreover, $O_{call} ({th}_{2}^{s}) = {Okeygen []}$ , so $gettokens (I_{1}, O_{call} ({th}_{2}^{s}), l_{tok}) = {l_{2} \mapsto Callable}$ , so ${store}_{2}^{s} \cup gettokens (I_{1}, O_{call} ({th}_{2}^{s}), l_{tok}) = {store}_{2}$ : the part of the store corresponding to tokens of oracles, here the token of $Okeygen []$ , is computed by $gettokens$ ; it is included in the OCaml store but not in the simulator store. Therefore, Property (T2)(a) holds.

The only closure of the form ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}]$ in ${th}_{2}^{s}$ is ${env}_{2}^{s} (μ_{keygen} . init) = {tagfunction}^{keygen, τ_{1}} [{env}_{prim} \cup {token \mapsto l_{1}}, {pm}_{keygen []}^{'}]$ , so we define $l_{init - tok} = {keygen [] \mapsto l_{1}}$ and easily verify Property (T2)(b). Property (T2)(c) also concerns the same tagged closure, and is easily verified with $\tilde{a}$ empty. There is no tagged function ${tagfunction}^{t} pm$ , no $event$ and no $return$ in ${th}_{2}^{s}$ , so Property (T2)(d) holds, which concludes the verification of Property (T2).

A similar verification can be done for ${th}_{1}$ and ${th}_{1}^{s}$ ; we leave it to the reader.

Finally, we can define our relation between the intermediate and the OCaml semantics.
Definition 8.32 (Relation between extended CryptoVerif configurations and OCaml traces).

Let $C^{cs}$ be an extended CryptoVerif configuration and $C T$ be an OCaml trace that starts with the initial configuration $C_{0} (Q_{0}, {program}_{0})$ defined in Section 7. We say that $C^{cs} \equiv C T$ when there exists an injective function $τ_{O}$ that maps oracles $O [_, \tilde{a}]$ such that $O [[a^{'}, + \infty [, \tilde{a}] \in I$ for some $a^{'}$ to tags τ, such that the following properties are all true:

(I1) $C^{cs} = E, P_{loop} {α / i^{'}}, T, Q, S_{loop} (α), E, steps, C S$ and $C S = ([{th}_{1}^{s}, \dots, {th}_{n}^{s}], {globalstore}^{s}, tj), R I, I$ .

(I2) $C$ is the last configuration of $C T$ and $C = [{th}_{1}^{o}, \dots, {th}_{n}^{o}], {globalstore}^{o}, tj, M I, events$ .

(I3) $Q = {Q_{loop} {a / i^{'}} ∣ α < a ⩽ N_{rand + calls}} \cup Q_{0}$ and $Q_{0} \leftrightarrow R I, I$ .

(I4) $fv (P_{loop} {α / i^{'}}) \cup fv (Q) \cup fv (S_{loop} (α)) \subseteq Dom (E)$ .

(I5) For $i ⩽ n$ , all store locations in ${Loc}_{ℓ}$ present in ${th}_{i}^{s}$ are in $Dom ({store}_{i}^{s})$ , where ${th}_{i}^{s} = ⟨ {env}_{i}^{s}, {pe}_{i}^{s}, {stack}_{i}^{s}, {store}_{i}^{s} ⟩$ .

(I6) For $i ⩽ n$ , ${th}_{i}^{o}$ matches ${th}_{i}^{s}$ knowing $I$ , E, $Q$ , $τ_{O}$ (Definition 8.30).

(I7) For all locations $l \in {Loc}_{priv}$ , l does not occur in ${th}_{1}^{s}, \dots, {th}_{n}^{s}$ except in $program (μ_{role})$ in arguments of $addthread$ .

(I8) $\forall l \in {Loc}_{priv}$ , ${globalstore}^{s} (l) = {initval}_{l}$ .

(I9) $globalstore (E, T) \subseteq {globalstore}^{o}$ .

(I10) $\forall l \notin {Loc}_{priv}$ , ${globalstore}^{s} (l) = {globalstore}^{o} (l)$ .

(I11) $M I = {(μ_{role}, Once) ∣ role [\tilde{a}] \in R I} \cup {(μ_{role}, Any) ∣ role [[a^{'}, + \infty [, \tilde{a}] \in R I}$ .

(I12) $events = G_{ev} (E)$ .

(I13) The oracles of $C S$ have distinct status (Definition 8.22).

(I14) $| C T | + steps ⩾ N_{steps}$ .

(I15) $α ⩽ N_{rand} (C T) + \sum_{O, τ} N_{calls} (O, τ, C T) + 1$ .

(I16) If $O [[a^{'}, + \infty [, \tilde{a}] \in I$ , then $a^{'} ⩽ N_{calls} (O, τ_{O} (O [_, \tilde{a}]), C T) + 1$ .

(I17) If $role [[a^{'}, + \infty [, \tilde{a}] \in R I$ , then $a^{'} ⩽ N_{exec} (role, C T) + 1$ .

The relation $C^{cs} \equiv C T$ is our main tool to relate the CryptoVerif and OCaml systems. This relation holds only when the CryptoVerif adversary is evaluating the function ${simulate}_{ML}$ (line 7 of Fig. 18), as shown by the form of the extended CryptoVerif configuration $C^{cs}$ in Item (I1). (The value α is the current value of the index $i^{'}$ , that is, the number of iterations in the loop.) Items (I1) and (I2) also ensure that there is the same number of threads in the simulator configuration $C S$ and in the OCaml configuration $C$ .

Item (I3) is an invariant on the CryptoVerif side: it relates the available oracles in $Q$ to elements of the simulator configuration. This item ensures basically that when the simulator calls an oracle present in $I$ , it is also present in $Q$ , and the oracle call in the CryptoVerif adversary (line 13 of Fig. 18) can proceed. Item (I4) is an invariant of the CryptoVerif semantics: the environment contains bindings for every free variable present in the current configuration. Item (I5) is an invariant of the simulator: each store location that occurs in a thread is present in the domain of the store. (When a location is created, it is immediately added to the store.)

Item (I6) relates the threads of the simulator and of the OCaml semantics, following Definition 8.30.

Items (I7)–(I10) relate the values of the global store in the simulator and in the OCaml semantics. The public part of the global store is the same on both sides (Item (I10)). The private part (files and tables) is empty in the simulator, since this part is handled by CryptoVerif itself (Item (I8)) and cannot be accessed by the adversary (Item (I7)). We require that the private part of the OCaml global store corresponds to the CryptoVerif configuration (Item (I9)).

Item (I11) relates the OCaml multiset of callable modules $M I$ and the simulator set of callable roles $R I$ . Item (I12) relates the OCaml and CryptoVerif events. Item (I13) guarantees that the oracles have distinct status, following Definition 8.22. This property allows us to prove that the injections $l_{tok}$ and $l_{init - tok}$ of Items (T2)(a) and (T2)(b) of Definition 8.30 are kept. (These injections appear in Item (I6).)
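As an illustration of Item (I11), the following minimal sketch computes the multiset of callable modules from the set of callable roles. The types `role_entry` and `multiplicity`, the function `callable_modules` and the `mu_` naming are assumptions made for this sketch only; they are not part of the formal development, and the multiset is simply represented as a list.

```ocaml
(* Hypothetical encoding of the entries of RI. *)
type role_entry =
  | Single of string * int list
      (* role[a~]: a role that can still be instantiated once *)
  | Replicated of string * int * int list
      (* role[[a', +infinity[, a~]: a role under replication *)

(* Multiplicity attached to a module in MI. *)
type multiplicity = Once | Any

(* Item (I11): a non-replicated role yields (mu_role, Once) and a
   replicated role yields (mu_role, Any). *)
let callable_modules (ri : role_entry list) : (string * multiplicity) list =
  List.map
    (function
      | Single (role, _) -> ("mu_" ^ role, Once)
      | Replicated (role, _, _) -> ("mu_" ^ role, Any))
    ri
```

With an empty set of roles, as in the configuration reached after $μ_{keygen} . init ()$ , the function returns the empty list, which corresponds to $M I = \emptyset$ .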

Items (I14)–(I17) ensure that we never reach the limits on the number of simulator steps $N_{steps}$ (Item (I14)), the number of calls to the oracles (Item (I15) for the oracle $O_{loop}$ and Item (I16) for the other oracles), and the number of calls to roles (Item (I17)), by making sure that the number of calls on the CryptoVerif side is at most the number of calls on the OCaml side. The number of calls made to oracle $O [_, \tilde{a}]$ in CryptoVerif, $a^{'} - 1$ such that $O [[a^{'}, + \infty [, \tilde{a}] \in I$ , may be less than the number of calls to that oracle in the OCaml trace, $N_{calls} (O, τ_{O} (O [_, \tilde{a}]), C T)$ , because failed calls are not counted on the CryptoVerif side.

Example 8.33.
We verify the relation $C_{1}^{cs} \equiv {C T}_{1}$ after evaluating $μ_{keygen} . init ()$ in Example 7.3. The intermediate semantic configuration $C_{1}^{cs}$ is $C_{1}^{cs} = C_{1}, steps, {C S}_{1}$ where $C_{1}$ and ${C S}_{1}$ are defined in Example 8.16 and $steps$ is $N_{steps}$ minus the number of steps executed in the simulator. The OCaml trace ${C T}_{1}$ ends at the configuration $C_{1}$ defined in Example 7.3. We use the notations of these examples.

Properties (I1) and (I2) are obvious from the form of the configurations, with $α = 1$ , $n = 2$ , and $tj = 2$ . Properties (I3), (I6) and (I13) have been verified in Examples 8.25, 8.31 and 8.23, respectively. Property (I4) can be verified by computing the value of $E_{1}$ ; we leave this detail to the reader. There is no store location in ${th}_{1}^{s}$ and the only store location of ${th}_{2}^{s}$ is $l_{1}$ , which is in $Dom ({store}_{1}^{s})$ , so Property (I5) holds.

The locations $pkfile$ and $skfile$ do not occur in either ${th}_{1}^{s}$ or ${th}_{2}^{s}$ , so Property (I7) holds. (They occur in $program (μ_{keygen})$ in the argument of $addthread$ in the initial program ${program}_{0}$ , but they disappear when $addthread$ is executed.) For $l \in {skfile, pkfile}$ , ${globalstore}_{1}^{s} (l) = {initval}_{l}$ so Property (I8) holds. We have $globalstore (E_{1}, T_{1}) = \emptyset$ because neither $sk []$ nor $pk []$ is defined in $E_{1}$ , so Property (I9) holds, and ${globalstore}_{1}^{s} (pkg) = {globalstore}_{1} (pkg)$ , so Property (I10) holds. (The verification of the correspondence between the global stores is not very interesting in this configuration. It is more interesting at the end of the execution of ${program}_{0}$ . At this point, ${globalstore}^{o} = {skfile \mapsto v_{sk}, pkfile \mapsto v_{pk}, pkg \mapsto v_{pk}}$ , since $skfile$ and $pkfile$ are written by $μ_{keygen} . init () ()$ and $pkg$ is written by ${program}_{0}$ , while ${globalstore}^{s} = {skfile \mapsto "", pkfile \mapsto "", pkg \mapsto v_{pk}}$ since the simulator calls $Okeygen$ via CryptoVerif, which does not write into files. The minimal global store $globalstore (E, T)$ contains values for $skfile$ and $pkfile$ since $sk []$ and $pk []$ are defined in the CryptoVerif environment E after calling $Okeygen$ and they should be stored in the files $skfile$ and $pkfile$ , respectively. These values are indeed in ${globalstore}^{o}$ , so Property (I9) holds. However, ${globalstore}^{s} (l)$ still contains the initial values for $l \in {skfile, pkfile}$ , so Property (I8) holds. The same value for $pkg$ appears in ${globalstore}^{s}$ and ${globalstore}^{o}$ , so Property (I10) holds.)
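The checks of Properties (I8)–(I10) at the end of the execution of ${program}_{0}$ can be replayed on a toy model. The sketch below is only illustrative: stores are modelled as association lists from location names to strings, the strings `"v_sk"` and `"v_pk"` are placeholders for the actual key values, and the function names are assumptions made for this sketch.

```ocaml
(* A store maps location names to contents (toy model). *)
type store = (string * string) list

(* Private locations of the example; files are initially empty. *)
let private_locs = ["skfile"; "pkfile"]
let initval (_ : string) = ""

(* Item (I8): private locations keep their initial value in the simulator. *)
let private_untouched (gs_s : store) : bool =
  List.for_all
    (fun l -> List.assoc_opt l gs_s = Some (initval l))
    private_locs

(* Item (I9): the minimal global store is included in the OCaml store. *)
let minimal_included (minimal : store) (gs_o : store) : bool =
  List.for_all (fun (l, v) -> List.assoc_opt l gs_o = Some v) minimal

(* Item (I10): both stores agree on every public location. *)
let public_agree (gs_s : store) (gs_o : store) : bool =
  let agree a b =
    List.for_all
      (fun (l, v) -> List.mem l private_locs || List.assoc_opt l b = Some v)
      a
  in
  agree gs_s gs_o && agree gs_o gs_s

(* Stores at the end of the execution of program_0, with placeholder values. *)
let gs_o = [("skfile", "v_sk"); ("pkfile", "v_pk"); ("pkg", "v_pk")]
let gs_s = [("skfile", ""); ("pkfile", ""); ("pkg", "v_pk")]
let minimal = [("skfile", "v_sk"); ("pkfile", "v_pk")]

let all_hold =
  private_untouched gs_s && minimal_included minimal gs_o
  && public_agree gs_s gs_o
```

Here `all_hold` evaluates to `true`, whereas `private_untouched gs_o` is `false`, since only the OCaml side writes the key files.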

We have ${M I}_{1} = \emptyset$ and ${R I}_{1} = \emptyset$ , so Property (I11) holds; ${events}_{1} = []$ and $E_{1} = []$ , so Property (I12) holds. Property (I14) can be verified by counting the number of steps in OCaml and in the simulator. We omit this tedious but not difficult point here. Property (I15) holds because $α = 1$ ; there are no random number generations or oracle calls in ${C T}_{1}$ . Properties (I16) and (I17) hold because neither $I_{1} = {Okeygen []}$ nor ${R I}_{1} = \emptyset$ contains oracles or roles of the considered form.

The next two lemmas show that the relation $C^{cs} \equiv C T$ is preserved during execution. Lemma 8.34 shows that it holds at the beginning, as soon as the simulator reaches line 7 of Fig. 18.
Lemma 8.34.
There exists a trace $C_{0} (Q_{0}, {program}_{0}) ⇝^{*} C^{cs}$ where $C^{cs} \equiv {C T}_{0}$ and ${C T}_{0} = C_{0} (Q_{0}, {program}_{0})$ .

Lemma 8.35 shows that the relation $C^{cs} \equiv C T$ is preserved. More precisely, the relation does not hold at all steps (in particular because it holds only when the CryptoVerif adversary is executing ${simulate}_{ML}$ ), but if it holds at some point, we can continue execution so that either it holds again at a later point, or execution ends with matching events.
Lemma 8.35.
Let $C^{cs}$ be such that there exists a trace $C T$ satisfying $C^{cs} \equiv C T$ .
Either there exist n configurations $C_{1}^{cs}, \dots, C_{n}^{cs}$ and n traces $C^{cs} ⇝_{p_{1}}^{+} C_{1}^{cs}, \dots, C^{cs} ⇝_{p_{n}}^{+} C_{n}^{cs}$ such that none of these traces is a prefix of another, $\sum_{i ⩽ n} p_{i} = 1$ , and for each trace $C T$ such that $C^{cs} \equiv C T$ , there exist n pairwise disjoint trace sets ${C T S}_{1}, \dots, {C T S}_{n}$ such that all traces in these sets are extensions of $C T$ , none of these traces is a prefix of another, $Pr [{C T S}_{i}] = p_{i} \cdot Pr [C T]$ , and for each trace ${C T}^{'} \in {C T S}_{i}$ , we have $C_{i}^{cs} \equiv {C T}^{'}$ .

Or for each trace $C T$ such that $C^{cs} \equiv C T$ , the last configuration $C$ of $C T$ cannot reduce, $C^{cs} \to^{+} C_{1}^{cs}$ , the configuration $C_{1}^{cs}$ cannot reduce, and the event list $E$ of $C_{1}^{cs}$ and the event list $events$ of $C$ satisfy $events = G_{ev} (E)$ .

We prove these lemmas in Appendix F (see the Supplemental material). Let us present a proof sketch of Lemma 8.35.
Proof sketch of Lemma 8.35.
Let us take an extended CryptoVerif configuration $C^{cs}$ and an OCaml trace $C T$ such that $C^{cs} \equiv C T$ . Let $C$ be the last configuration of $C T$ . Let $C S$ be the configuration of the simulator in $C^{cs}$ and ${th}^{s}$ be the current thread of $C S$ .
Case 1.
The current thread of $C S$ verifies Item (T1) of Definition 8.30, so we run the initialization of the module. The programs of the current threads of $C S$ and $C$ are the same except that the occurrences of ${program}^{'} (role [\tilde{a}])$ present in $C S$ are transformed into $program (μ_{role})$ . We show that after having reduced the initialization of the primitives and the initialization of the roles on both sides, the current threads verify Item (T2). The oracles in $O^{\infty} (R_{init - function} ({th}^{s}))$ that correspond to the roles implemented in this initialization are moved to $O^{\infty} (R_{init - closure} ({th}^{s}))$ . We prove that the relation $C^{cs} \equiv C T$ is preserved.
Case 2.
The current thread of $C S$ verifies Item (T2). We distinguish cases on the form of the simulator configuration $C S$ .

Let us first look at the cases in which the configuration $C S$ does not reduce. We use the rule (Leave Simulator), thus finishing the evaluation of the function ${simulate}_{ML}$ .
If the current expression of $C S$ is $call (O_{j} [\tilde{a}]) v$ , then the result of ${simulate}_{ML}$ is such that $o = o_{j}$ , so the CryptoVerif adversary of Fig. 18 calls the oracle $O_{j}$ at line 13 in the branch $o = o_{j}$ , ends one iteration of $O_{loop}$ , and starts the next iteration until it reaches line 7. We use Lemma 8.10 and we exploit the definition of ${simulate}_{ret O_{j}}$ and ${simulate}_{end O_{j}}$ to prove that the OCaml configuration reduces similarly, by calling the OCaml function generated for oracle $O_{j}$ . The oracle $O_{j} [\tilde{a}]$ is removed from $O^{\infty} (I)$ , and from $O_{call} ({th}^{s})$ if all occurrences of $call (O_{j} [\tilde{a}])$ have disappeared. The newly available oracles, added to sets $O^{\infty} (R I)$ or $O_{call} ({th}^{s})$ and $O_{call - repl} ({th}^{s})$ , are removed from the set $willbeavailable (C S)$ . We prove that the relation $C^{cs} \equiv C T$ is preserved.

If the current expression of $C S$ is $random ()$ , then the result of ${simulate}_{ML}$ is such that $o = o_{R}$ , so the CryptoVerif adversary of Fig. 18 samples a random boolean at line 19, ends one iteration of $O_{loop}$ , and starts the next iteration until it reaches line 7. The current expression of $C S$ is replaced with $true$ with probability $1 / 2$ and $false$ with probability $1 / 2$ . The OCaml configuration reduces similarly: it samples a random boolean by evaluating $random ()$ , and the relation $C^{cs} \equiv C T$ is preserved.

Otherwise, the configuration $C S$ cannot reduce, and the corresponding configuration $C$ cannot reduce either. The result of ${simulate}_{ML}$ is such that $o = o_{S}$ , so the CryptoVerif adversary of Fig. 18 ends the current iteration of $O_{loop}$ at line 9, and ends the loop at line 4, so it also stops. The events in the final CryptoVerif and OCaml configurations match, so the second case of the lemma holds.
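The three situations above, in which the simulator hands control back to the CryptoVerif adversary, can be pictured as a dispatch loop. The sketch below is a simplified model and not the adversary of Fig. 18: the types `request` and `answer`, the function `run_loop` and its labelled parameters are assumptions made for this illustration, the simulator state is threaded explicitly, and the random coin is passed as a parameter so that the example is deterministic.

```ocaml
(* What the simulator asks for when it cannot reduce by itself. *)
type request =
  | Call of int * string  (* o = o_j: call oracle O_j with an argument *)
  | Random                (* o = o_R: sample a random boolean *)
  | Stop                  (* o = o_S: nothing left to do *)

(* What is fed back to the simulator at the next iteration. *)
type answer = Start | Result of string | Bit of bool

(* [simulate] plays the role of simulate_ML, [oracle j v] the role of a
   call to O_j, and [coin] the random boolean. The function returns the
   number of loop iterations that performed a call or a random choice. *)
let rec run_loop ~simulate ~oracle ~coin state answer count =
  match simulate state answer with
  | state', Call (j, v) ->
      run_loop ~simulate ~oracle ~coin state' (Result (oracle j v)) (count + 1)
  | state', Random ->
      run_loop ~simulate ~oracle ~coin state' (Bit (coin ())) (count + 1)
  | _, Stop -> count

(* Toy simulator: ask for a random bit, then call oracle 1, then stop. *)
let toy_simulate state _answer =
  match state with
  | 0 -> (1, Random)
  | 1 -> (2, Call (1, "msg"))
  | _ -> (state, Stop)

let iterations =
  run_loop ~simulate:toy_simulate
    ~oracle:(fun _ v -> v)
    ~coin:(fun () -> true)
    0 Start 0
```

In this toy run, `iterations` is 2: one iteration for the random boolean and one for the oracle call, after which the simulator stops.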

If the current expression of $C S$ is $addthread (program)$ , a new thread is created on both sides. If $program$ is a protocol program, then this new thread satisfies Item (T1) by definition of $addthread$ in OCaml and in the simulator and by definition of $replaceinitpm$ . The roles added in this new thread $th$ are removed from $R I$ and the corresponding oracles are added to $O^{\infty} (R_{init - function} (th))$ and to $I$ . Otherwise, the new thread satisfies Item (T2). We prove that the relation $C^{cs} \equiv C T$ is preserved.

If the current expression of $C S$ is $call (O_{j} [\tilde{a}]) v$ and the configuration $C S$ reduces by (FailedCall1) or (FailedCall2), then the simulator raises $Bad_Call$ , and the corresponding tagged function in OCaml also raises $Bad_Call$ (because the tokens in OCaml correspond to $I$ in the simulator by Item (T2)(a)). We prove that the relation $C^{cs} \equiv C T$ is preserved.

If the current expression of $C S$ is ${tagfunction}^{role, τ} [env, {pm}_{role [\tilde{a}]}^{'}] ()$ , then the initialization function of role $role$ is executed. This role is removed from $R_{init - closure} ({th}^{s})$ , and the corresponding oracles are added to $O_{call} ({th}^{s})$ and to $O_{call - repl} ({th}^{s})$ . We prove that the relation $C^{cs} \equiv C T$ is preserved.

The other cases are straightforward since the simulator mimics the OCaml semantics. They all preserve the relation $C^{cs} \equiv C T$ . □
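To make the role of tokens in the failed-call case more concrete, here is a minimal sketch of a closure that may be called only once and raises $Bad_Call$ afterwards. It only illustrates the idea; the helper `once` and the oracle body are hypothetical, and the actual generated code may organize its tokens differently.

```ocaml
exception Bad_Call

(* Hypothetical helper: wrap an oracle body so that it can be called at
   most once. The token is a private reference captured by the closure. *)
let once (body : 'a -> 'b) : 'a -> 'b =
  let token = ref true in
  fun x ->
    if not !token then raise Bad_Call;
    token := false;
    body x

(* Toy oracle body, standing for the code generated for some oracle. *)
let oa = once (fun msg -> "signed:" ^ msg)

(* The first call succeeds and consumes the token. *)
let first = oa "m"

(* The second call fails with Bad_Call. *)
let second_fails =
  try ignore (oa "m"); false with Bad_Call -> true
```

In the proof, the fact that such tokens on the OCaml side correspond to the set $I$ in the simulator is what guarantees that both sides fail on the same calls.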

From Lemmas 8.34 and 8.35, we can prove the following proposition, by extending the traces using Lemma 8.35 until we get complete traces.
Proposition 8.36.
Let ${C T}_{1}, \dots, {C T}_{n}$ be complete CryptoVerif traces starting at $C_{0} (Q_{0}, {program}_{0})$ .

Then there exist disjoint sets of complete OCaml traces ${C T S}_{1}, \dots, {C T S}_{n}$ all starting at $C_{0} (Q_{0}, {program}_{0})$ such that for all $i ⩽ n$ , $Pr [{C T}_{i}] = Pr [{C T S}_{i}]$ , and if $C$ is the last (CryptoVerif) configuration of ${C T}_{i}$ and $C$ is the last (OCaml) configuration of a trace in ${C T S}_{i}$ , then the event list $E$ of the former and the event list $events$ of the latter satisfy $events = G_{ev} (E)$ .

We prove this proposition in more detail in Appendix G (see the Supplemental material). As an immediate consequence of this proposition, we obtain the following proposition.
Proposition 8.37.
$Pr [C_{0} (Q_{0}, {program}_{0}) :^{(⇝)} D] = Pr [C_{0} (Q_{0}, {program}_{0}) :^{(ML)} D]$ .

8.4.3. Security result

By combining Propositions 8.19 and 8.37, we obtain the following theorem.

Theorem 8.38 (Security result).

$Pr [C_{0} (Q_{0}, {program}_{0}) :^{(CV)} D] = Pr [C_{0} (Q_{0}, {program}_{0}) :^{(ML)} D] .$

In other words, the adversary ${program}_{0}$ against our generated OCaml modules has the same probability of breaking the security property as the adversary $Q_{adv} (Q_{0}, {program}_{0})$ against the CryptoVerif process.

CryptoVerif bounds the probability that an adversary Q breaks the security property D, that is, it finds a probability p that depends on the adversary such that, for all CryptoVerif adversaries Q for $Q_{0}$ , $Pr [C_{i} (Q_{0} ∣ Q) :^{(CV)} D] ⩽ p .$ The adversaries $Q_{adv} (Q_{0}, {program}_{0})$ are CryptoVerif adversaries for $Q_{0}$ , so for all OCaml programs $program$ that obey our assumptions, $Pr [C_{0} (Q_{0}, program) :^{(ML)} D] = Pr [C_{0} (Q_{0}, program) :^{(CV)} D] ⩽ p .$ Hence, all considered OCaml adversaries $program$ can break the security property D with probability at most p.

The probability bound p returned by CryptoVerif is a function that depends on many parameters, expressed in terms of the CryptoVerif protocol specification. Let us relate these parameters to the OCaml implementation. These parameters are as follows:

The maximum number of times the various oracles and roles have been called, $N_{O}$ and $N_{role}$ . As shown by our proof and by Definition 8.11, $N_{O}$ can be set to the maximum number of calls to the same closure representing oracle O in any trace of the OCaml program, and $N_{role}$ can be set to the maximum number of instantiations of the role $role$ in any trace of this program.

The size of the CryptoVerif types T. The corresponding OCaml type $G_{T} (T)$ is fixed by the annotations of the CryptoVerif specification. The size of T can be set to the size of $G_{T} (T)$ . Similarly, the size of the CryptoVerif values a (used when their type T has unbounded size) can be set to the size of the corresponding OCaml value $G_{val T} (a)$ .

The execution time of the cryptographic primitives and of various CryptoVerif constructs. This time can be set to the execution time of the corresponding OCaml implementation.

The execution time of the adversary. Our proof shows that the function ${simulate}_{ML}$ executes at most as many reduction steps as the OCaml adversary. However, the CryptoVerif adversary shown in Fig. 18 also includes additional steps and conversions between the OCaml semantic configuration and its CryptoVerif bitstring representation. By using the contents of the OCaml memory as bitstring representation of the semantic configuration in CryptoVerif, we can obtain an efficient implementation of the CryptoVerif adversary that does not take significantly more time than the OCaml adversary.

From the probability bound given by CryptoVerif, we can then obtain a bound on the probability of breaking the security properties in the generated OCaml implementation of the protocol.

Example 8.39.
For the protocol $Q_{0}$ of Example 4.1, using Theorem 8.38 and the probability bound computed by CryptoVerif in Example 4.9, we obtain that our generated implementation satisfies $Pr [C_{0} (Q_{0}, program) :^{(ML)} D_{c}] ⩽ {Succ}_{sign}^{uf - cma} (t + (N_{2} - 1) t_{check}, N_{1}),$ where t is the execution time of the adversary $program$ , $t_{check}$ is the maximum execution time of a call to the implementation of $check$ , $N_{1}$ is the maximum number of calls to oracle $OA$ , $N_{2}$ is the maximum number of calls to oracle $OB$ , and ${Succ}_{sign}^{uf - cma} (t^{'}, n^{'})$ is the probability of forging a signature in time $t^{'}$ with at most $n^{'}$ calls to the signature oracle.

As detailed in [10], CryptoVerif shows that our model of the SSH Transport Layer Protocol guarantees the authentication of the server to the client and the secrecy of the session keys. By Theorem 8.38, our generated implementation of this protocol satisfies the same properties, provided Assumptions (A1)–(A7) hold.

9. Conclusion

We have proved that our compiler preserves security. Therefore, by using CryptoVerif, we can prove the desired security properties on the protocol specification, and then by using our compiler, we get a runnable implementation of the protocol, which satisfies the same security properties as the specification. Making such a proof is also useful because it clarifies the assumptions needed to ensure that the implementation is secure (Assumptions (A1)–(A7) in our case). The proof technique presented in this paper, simulating any adversary by a CryptoVerif process, is also useful to show that any Turing machine can be encoded as a CryptoVerif adversary, which is important for the validity of the verification by CryptoVerif.

Our approach could also be used to generate implementations in languages other than OCaml. It should not be difficult to adapt our compiler to another language. The structure of the proof should also remain the same, but the details will need to be adapted to the semantics of each programming language. In a target language such as C, closures that we use to represent oracles could be represented by records containing a function pointer. Since C does not guarantee memory safety, an additional analysis of the network code should be performed to make sure that it does not access private data of our generated code. To simplify the analysis, one may require that the generated code and the network code belong to a clean subset of C. One might also go all the way to the generation of certified machine code, by using a certified compiler, as in [4].
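The idea of representing a closure by a record that contains a function pointer can be sketched in OCaml itself, by making the captured environment explicit. This is only an illustration of the encoding: the record types, their fields and the toy oracle body are assumptions made for this sketch, not the output of our compiler.

```ocaml
(* Data captured by the oracle (hypothetical fields). *)
type env = { key : string; mutable calls : int }

(* An explicit closure: [code] captures nothing and receives its
   environment as an argument, like a C function pointer stored next to
   a pointer to its data. *)
type oracle = { env : env; code : env -> string -> string }

(* Toy oracle body: counts the calls and tags the message with the key. *)
let sign_code (e : env) (msg : string) : string =
  e.calls <- e.calls + 1;
  e.key ^ ":" ^ msg

(* Build the record that stands for the closure. *)
let make_oracle (key : string) : oracle =
  { env = { key; calls = 0 }; code = sign_code }

(* Calling the closure passes the stored environment to the stored code. *)
let call (o : oracle) (arg : string) : string = o.code o.env arg

let o = make_oracle "k"
let r = call o "m"
```

After this call, `r` is `"k:m"` and `o.env.calls` is 1. In C, the field `code` would become a function pointer and `env` a pointer to a structure, which is precisely the data that the network code must not be able to access.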

Extending the specification language of CryptoVerif, for instance with loops and mutable data structures, would be helpful to implement real, complex protocols. The main difficulty in this task does not lie in the generation of implementations, but in the extension of the prover CryptoVerif itself. Formalizing our manual proof using a proof assistant (e.g., Coq) would also be interesting future work. We believe that our detailed proof will be a good starting point for that. It would also be interesting to extend our approach to cover side-channel attacks, such as timing attacks and power consumption attacks. Protection against such attacks is important in practical protocols.

Supplemental material

Online supplement consisting of Appendices is available at: https://dx-doi-org.web.bisu.edu.cn/10.3233/JCS-150524

Acknowledgments

This work was partly done while the authors were at École Normale Supérieure, Paris. It was partly supported by the ANR project ProSe (decision ANR 2010-VERS-004). The authors thank the reviewers for their helpful comments on a previous version of this paper.

References

[1] http://research.microsoft.com/en-us/projects/cvk/.

[2] Aizatulin, A.D. Gordon and Jürjens, Extracting and verifying cryptographic models from C protocol code by symbolic execution, in: CCS’11, ACM, New York, 2011, pp. 331–340.

[3] Aizatulin, A.D. Gordon and Jürjens, Computational verification of C protocol implementations by symbolic execution, in: CCS’12, ACM, New York, 2012, pp. 712–723.

[4] J.B. Almeida, Barbosa, Barthe and Dupressoir, Certified computer-aided cryptography: Efficient provably secure machine code from high-level implementations, in: ACM SIGSAC Conference on Computer and Communications Security (CCS’13), Berlin, Germany, November 2013, ACM, 2013, pp. 1217–1230.

[5] Bengtson, Bhargavan, Fournet, Gordon and Maffeis, Refinement types for secure implementations, ACM TOPLAS 33(2) (2011), Article No. 8.

[6] Bhargavan, Fournet, Gordon and Tse, Verified interoperable implementations of security protocols, ACM TOPLAS 31(1) (2008), Article No. 5.

[7] Blanchet, A computationally sound mechanized prover for security protocols, IEEE Transactions on Dependable and Secure Computing 5(4) (2008), 193–207.

[8] Blanchet, Automatically verified mechanized proof of one-encryption key exchange, in: CSF’12, IEEE, Los Alamitos, 2012, pp. 325–339.

[9] Blanchet and Pointcheval, Automated security proofs with sequences of games, in: CRYPTO’06, LNCS, Vol. 4117, Springer, 2006, pp. 537–554.

[10] Cadé and Blanchet, From computationally-proved protocol specifications to implementations and application to SSH, Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA) 4(1) (2013), 4–31.

[11] Cadé and Blanchet, Proved generation of implementations from computationally-secure protocol specifications, in: 2nd Conference on Principles of Security and Trust (POST 2013), Rome, Italy, March 2013, Basin and Mitchell, eds, LNCS, Vol. 7796, Springer, 2013, pp. 63–82.

[12] Chaki and Datta, ASPIER: An automated framework for verifying security protocol implementations, in: CSF’09, IEEE, Los Alamitos, 2009, pp. 172–185.

[13] Corin and F.A. Manzano, Efficient symbolic execution for analysing cryptographic protocol implementations, in: Engineering Secure Software and Systems (ESSoS’11), Madrid, Spain, February 2011, Ú. Erlingsson, Wieringa and Zannone, eds, LNCS, Vol. 6542, Springer, 2011, pp. 58–72.

[14] Dupressoir, A.D. Gordon, Jürjens and D.A. Naumann, Guiding a general-purpose C verifier to prove cryptographic protocols, in: CSF’11, IEEE, Los Alamitos, 2011, pp. 3–17.

[15] Fournet, Kohlweiss and P.-Y. Strub, Modular code-based cryptographic verification, in: CCS’11, ACM, New York, 2011, pp. 341–350.

[16] Milicia, χ-spaces: Programming security protocols, in: NWPT’02, 2002.

[17] Owens, A sound semantics for OCaml light, in: ESOP’08, Drossopoulou, ed., LNCS, Vol. 4960, Springer, Heidelberg, 2008, pp. 1–15.

[18] Owens, Peskine and Sewell, A formal specification for OCaml: The core language, 2008, available at: http://www.cl.cam.ac.uk/~so294/ocaml/caml_typedef.pdf.

[19] Pironti and Sisto, Provably correct Java implementations of Spi Calculus security protocols specifications, Computers and Security 29(3) (2010), 302–314.

[20] Swamy, Chen, Fournet, P.-Y. Strub, Bhargavan and Yang, Secure distributed programming with value-dependent types, in: ICFP’11, ACM, New York, 2011, pp. 266–278.

² From version 4.02, OCaml has a command-line option that makes strings immutable.
