Abstract
In order to obtain implementations of security protocols proved secure in the computational model, we previously proposed the following approach: we write a specification of the protocol in the input language of the computational protocol verifier CryptoVerif, prove it secure using CryptoVerif, then generate an OCaml implementation of the protocol from the CryptoVerif specification using a specific compiler that we have implemented. However, until now, this compiler was not proved correct, so we did not have real guarantees on the generated implementation. In this paper, we fill this gap. We prove that this compiler preserves the security properties proved by CryptoVerif: if an adversary has probability p of breaking a security property in the generated code, then there exists an adversary that breaks the property with the same probability p in the CryptoVerif specification. Therefore, if the protocol specification is proved secure in the computational model by CryptoVerif, then the generated implementation is also secure.
Introduction
The verification of security protocols is an important research area since the 1990s: the design of security protocols is notoriously error-prone, and errors can have serious consequences. Formal verification first focused on verifying formal specifications of protocols. However, verifying a specification does not guarantee that the protocol is correctly implemented from this specification. It is therefore important to make sure that the implementation is secure, and not only the specification. Moreover, two models were considered for verifying protocols. In the symbolic model, the so-called Dolev–Yao model, messages are terms. This abstract model facilitates automatic proofs. In contrast, in the computational model, typically used by cryptographers, messages are bitstrings and attackers are polynomial-time probabilistic Turing machines. Proofs in the latter model are more difficult than in the former, but yield a much more precise analysis of the protocol. Therefore, we would like to obtain implementations of protocols proved secure in the computational model.
To reach this goal, we proposed the following approach in [10]. We start from a formal specification of the protocol. In order to prove the specified protocol secure in the computational model, we rely on the automatic protocol verifier CryptoVerif [7–9]. This verifier can prove secrecy and authentication properties. The generated proofs are proofs by sequences of games, like the manual proofs written by cryptographers. These games are formalized in a probabilistic process calculus. The specification of the protocol given as input to CryptoVerif then consists of a process representing the protocol to prove (the initial game of the proof), assumptions on the cryptographic primitives (such as “encryption is IND-CPA” and “decrypting a ciphertext with the correct key yields the initial cleartext”), and the security properties to prove. CryptoVerif then looks for a proof of the desired security properties, and when it finds one, it also provides a formula that bounds the probability of success of an attack against the desired properties as a function of the runtime of the adversary, the number of sessions of the protocol, and the probability of breaking each primitive.
In order to obtain a proved implementation from the specification, we have written a compiler that takes a CryptoVerif specification and returns an implementation in the functional language OCaml (http://caml.inria.fr). This compiler starts from a CryptoVerif specification annotated with implementation details. The annotations specify how to divide the protocol in different roles, for example, key generation, server, and client, and how to implement the various cryptographic primitives and types. They also specify which CryptoVerif variables should be written into files, because they are communicated from one role to another. For instance, the key generation typically writes long-term keys into files, so that they can be used by subsequent roles. The compiler then generates an OCaml module for each role in the input file. In order to get a full implementation of the protocol, this module is combined with manually written code, responsible in particular for sending and receiving messages from the network, which we call the network code. For instance, in the case of the client–server protocol, both the client and server programs consist of a mix of our generated modules, which deal with the heart of the cryptographic protocol, and manually written network code, which deals with non-cryptographic details.
To make sure that the generated implementation is actually secure, we need to prove the correctness of our compiler. This proof was still missing in [10]. It is the topic of this paper. To make this proof, we need a formal semantics of OCaml. We adapt the operational small-step semantics of a core part of OCaml by Owens et al. [17,18]. We add to this language support for simplified modules, multiple threads where only one thread can run at any given time, and communication between threads by a shared part of the store.
An adversary against the generated implementation is an OCaml program using the modules generated by our compiler. On the CryptoVerif side, an adversary is a process running in parallel with the verified protocol. In our proof, for each OCaml adversary, we construct a corresponding CryptoVerif adversary that simulates the behavior of the OCaml adversary. When the OCaml adversary calls one of the functions generated by our compiler, which comes from an oracle in the CryptoVerif process, the CryptoVerif adversary calls this oracle. Then we establish a precise correspondence between the traces of the CryptoVerif process with that CryptoVerif adversary and the traces of the OCaml program. This correspondence allows us to show that the probability of success of an attack is the same on the CryptoVerif side and on the OCaml side. Therefore, if CryptoVerif proves that the protocol is secure, then the generated OCaml implementation is also secure, and the bound on the probability of success of an attack computed by CryptoVerif is also valid for the implementation.
We have made several assumptions to obtain this proof; the most important ones are:
The random number generator used by the OCaml cryptographic library is perfect.
The implementation of each cryptographic primitive is a pure function and satisfies the assumptions made on it in the specification.
The roles are executed in the order specified in CryptoVerif (e.g., in a key-exchange protocol, the key generation is called before the servers and clients).
The adversary and the network code do not access files created by our implementation (e.g. private key files).
The network code is a well-typed OCaml program, which does not use unsafe OCaml functions to bypass the type system.
The network code does not mutate data passed to or received from generated code. This property can be guaranteed by representing such data by immutable OCaml types. However, such data includes bitstrings and the most natural type for representing bitstrings is the OCaml type
From version 4.02, OCaml has a command-line option that makes
Our semantics of threads is obeyed, which implies that only one thread can run at any given time and that one cannot fork in the middle of a role.
Because the network code and our generated modules run inside the same programs, we use assumptions (A5) and (A6) to make sure that the network code does not interfere with the generated code. In particular, Assumption (A5) prevents the network code from accessing the variables contained in the environment of functions returned by our generated code. These variables may contain secret keys, which the network code could send to the adversary if it had access to them. Moreover, our generated code may return both a public key and a function that includes this public key in its environment. If the network code could modify the returned public key, it would modify the key used by the function as well, so the protocol would use an unexpected public key. Assumption (A6) avoids that. Assumptions (A5) and (A6) are the only requirements on the network code needed to prove security so, except for these two assumptions, we consider the network code as part of the adversary. In Assumption (A7), the requirement that only one thread can run at any given time can be weakened as we discuss informally at the end of Section 5.2.5: the essential requirement is that two processes that read or write the same file are not run concurrently, which can be enforced using locks. Assumption (A7) also limits forking: forking is allowed when the local store is empty. In case one needs to fork in the middle of a role, one can split the role into two, which has the effect of transmitting the store via files between the two roles. It may also be possible to extend our result with an explicit fork instruction in the OCaml language.
Assumptions (A1), (A5) and (A7) are built into our semantics of OCaml, defined in Section 5. Assumption (A2) is formalized below by Assumption 8.4, with additional technical details formalized in Assumptions 8.1 and 8.2. Assumptions (A3), (A4) and (A6) are formalized by Assumptions 6.1, 7.2 and 8.3, respectively.
In this work, we do not consider side-channel attacks, such as timing and power consumption attacks, nor physical attacks. Like other mechanized tools for cryptographic proofs, CryptoVerif does not deal with these attacks.
Related work.
Several approaches have been considered in order to obtain proved implementations of security protocols. In the symbolic model, several approaches generate protocols from specifications, e.g. [16,19]. Other approaches analyze implementations by extracting a specification verified by a symbolic protocol verifier, e.g. [2,6], or analyze them by other tools such as the model-checker ASPIER [12], the general-purpose C verifier VCC [14], symbolic execution [13] or typing [5,20].
In contrast, the following approaches provide computational security guarantees, by analyzing implementations. The tool FS2CV [1] translates a subset of F# to the input language of CryptoVerif, which can then prove the protocol secure. The tool F7 [5], which uses a dependent type system to prove security properties on protocols implemented in F#, has been adapted to the computational model in [15]; it uses type annotations to help the proof. The symbolic execution approach of [2] provides computational security guarantees by applying a computational soundness result, which however restricts the class of protocols that can be considered. The tool of [3] generates a CryptoVerif model from a C implementation; however, it can analyze only a single execution path.
Recently, Almeida et al. [4] introduced a new approach for generating implementations with a computational proof. They extend the cryptographic prover EasyCrypt to support C-like programs, then they generate proved assembly code using an extended version of the CompCert certified C compiler. They mainly target cryptographic primitives (for instance, OAEP), and using EasyCrypt requires the user to give the games of the cryptographic proof, while in our approach CryptoVerif generates them.
To the best of our knowledge, our approach and that of [4] are the only ones for generating implementations with a computational proof. References [3,4] and our work are the only ones to provide an explicit bound on the probability of success of an attack against the verified protocol implementation.
In order to prove the correctness of a compiler, we first need a formal semantics of the source and target languages, and a formal definition of the compiler. Handling all this formalism is probably the main challenge of this paper; it explains its length.
After introducing some notations (Section 3), our first task is to formally define the common input language of CryptoVerif and of our compiler (Section 4). We define the semantics of this language as a probabilistic transition system on semantic configurations. CryptoVerif uses events to define security properties. For instance, a security property may be “if event
Section 5 defines the OCaml language. We rely on the operational small-step semantics of a core part of OCaml by Owens et al. [17,18], which we adapt to our setting. We add a primitive for random choices, which makes the semantics probabilistic. We also add support for simplified modules, multiple threads, and communication between threads by a shared part of the store. We adopt a simplified model of parallelism: only one thread runs at a time, and the adversary is in charge of scheduling. This model of parallelism is close to what happens in CryptoVerif; we explain informally why it is sufficient for our purpose in Section 5. Like the semantics of the CryptoVerif input language, the semantics of OCaml is defined as a probabilistic transition system on semantic configurations.
In order to prove our compiler, we instrument OCaml code in three ways (Section 6). We add events to the language, so that we can specify security properties in OCaml as we do in CryptoVerif. We introduce tagged functions and closures, which have the same semantics as ordinary functions and closures, but contain additional tags used in our code generation to indicate from which role or oracle the function comes. Each CryptoVerif role is translated by our compiler into an OCaml module; we add to the OCaml semantics a multiset of callable modules, which indicates which modules can be called to guarantee that only allowed roles are executed, as required by Assumption (A3). We show that this instrumentation does not alter the semantics of OCaml: an instrumented program behaves exactly in the same way as that program with the instrumentation deleted, provided only allowed roles are executed. More precisely, we show a weak bisimulation between the non-instrumented and the instrumented semantics.
Section 7 defines the translation from CryptoVerif to OCaml. In this translation, each role generates a module, and the oracles are represented by closures. Basically, the translation implements in OCaml the semantics of CryptoVerif given in Section 4. The translation is the same as the one given [10], except that the generated OCaml code is instrumented. The generated modules are combined with manually written network code (which is in particular responsible for inputting and outputting messages on the network) to produce the complete programs that implement the protocol. These programs are run in interaction with an adversary, which we also represent by an OCaml program. This is possible because OCaml with random choices is probabilistic Turing complete. The code generated from the CryptoVerif process
Section 8 proves the correctness of this compiler. Informally, when CryptoVerif shows that

Overview of our proof.
The presence of an arbitrary adversary complicates the proof. As illustrated in Fig. 1 and detailed in Section 8.3, to solve this problem, we build from the OCaml adversary defined in
The initial CryptoVerif configuration is then
To prove (1), we basically need to show that First, we state our assumptions on the implementation of the cryptographic primitives, and show that the primitives behave correctly independently of the rest of the program (Section 8.1). Second, we prove that the OCaml translation of a CryptoVerif oracle behaves like the oracle (Section 8.2). Finally, in Section 8.4, we prove that the CryptoVerif adversary interacting with
Therefore, we obtain the desired proof of (1) (Theorem 8.38). Because of the length of this proof, details are postponed to the Appendices (see the Supplemental material). An index of notations can be found in Appendix H (see the Supplemental material).
Let us introduce some basic notations. When f is a function, we denote by
We use the following notations for lists. Let
We define the function
Fonts.
We use a sans-serif font for CryptoVerif keywords (e.g.,
The CryptoVerif input language
This section presents the syntax and semantics of the CryptoVerif input language, as well as the annotations that specify implementation details. CryptoVerif supports two input languages: the channel and oracle front-ends. The channel front-end [7] uses channels to pass data between the adversary and the protocol, and the oracle front-end [9] defines oracles that can be called by the adversary. In this paper, we focus on the oracle front-end, which is closer to the syntax of games used by cryptographers; oracles are also easier to translate into OCaml functions. (Our compiler also supports the channel front-end.) We adapt the semantics given in [7] for the channel front-end to the oracle front-end.
Syntax and informal semantics

Syntax of the CryptoVerif language.
Let us first introduce the syntax of the CryptoVerif language in Fig. 2. The language is typed, and types T are subsets of
Variables
The oracle definitions Q represent the oracles that will become available to the adversary at this point. The nil construct 0 provides no oracle. The parallel composition
The oracle bodies P represent the behavior of the oracle. A return statement
An insert statement
An oracle call
A loop
Let us consider a simple protocol in which the first participant Alice generates a nonce m, sends it to the second participant Bob with a signature of the nonce under Alice’s signature key
The oracle
The oracle
The goal of this protocol is to guarantee that, with high probability, if B accepts a nonce
The previous toy example is not very realistic, in particular because B accepts messages only from A. In a more realistic setting, B could be a server that would process messages coming from several different clients. B would then use a table of keys to relate the identity of each client to its public key. We would then use the following process:
Tables of keys appear in many realistic protocols. For instance, the SSH client stores a table that contains the public keys and the names of the servers it connected to.

Semantics (1).
We present the semantics of the language in Figs 3 and 4. The semantics is defined as a reduction relation on semantic configurations, which are tuples of the form
The environment E is a mapping from array cells
The oracle body P is the oracle body currently running.
The mapping
The set
The list
The list
During execution, terms may be reduced into constant bitstrings, so we add constant bitstrings a to the grammar of terms M. The notation

Semantics (2).
The semantics is defined by probabilistic reduction rules between configurations:
The rule (New) evaluates
The rules (Insert), (Get1) and (Get2) deal with tables of keys. The rule (Insert) evaluates the inserted element and adds it to the table
The rule (Call) implements the oracle call
The rule (Return) pops an element
The rule (End) also pops an element
The rules (Loop1) and (Loop2) implement the
The initial configuration for running the oracle definition
CryptoVerif verifies the following requirements on
Variables are renamed so that each variable has a single definition. The indices
Property 4.3 makes sure that a distinct array cell is used in each copy of a process, so that all values of the variables during execution are kept in memory. (This helps in cryptographic proofs.)
The processes are well typed. (In particular, functions and oracles receive arguments of their expected types. For brevity, we do not detail the type system; see [7] for a similar type system.)
Property 4.4 requires the adversary to be well typed. This requirement does not restrict its computing power, because well-typed processes are Turing-complete, since primitives can implement any deterministic Turing machine. The type system also does not restrict the class of protocols that we consider, since the protocol may contain type-cast functions
We define types of oracles as follows. The type of a
An oracle may have several
Property 4.5 guarantees that the various definitions of an oracle are consistent, and can in fact be compiled into a single function in OCaml. The oracles at the beginning of Q are the oracles found in Q without recursively looking into oracle definitions.
Oracles with the same name can be defined only in different branches of an
Property 4.6 guarantees that there exists a single callable definition for each oracle. This property is formalized by the following lemma, proved in Appendix A (see the Supplemental material).
If the configuration
This lemma proves that the rule (Call) is deterministic. Therefore, all rules are deterministic, except the rules (New) and (Get1) which may make probabilistic choices.
As a consequence, if a configuration
(Traces).
Let us denote traces with the symbol
A complete trace is a trace whose last configuration is blocking.
The probability of the trace
The notation
The notation
The notation
Intuitively, when no trace in
In CryptoVerif, since for every reduction with a probabilistic choice, the environment E is modified so that we can determine from E which reduction was used, and one cannot remove elements from E, there will be at most one trace from one configuration to another. However, the notations of Definition 4.8 are also used for OCaml where there could be several configurations reducing to the same configuration, so they support this situation.
Finally, the security properties are defined using distinguishers D which are functions that take a list of events
To show that the protocol
We can also define secrecy using events and distinguishers [8].
In order to compile a CryptoVerif process into an implementation, we added annotations to the language, to specify implementation details.
First, we separate the parts of the process that correspond to different roles, such as client and server, which will be included in different OCaml programs in the generated implementation. We annotate processes to specify roles: the beginning of a role
The process for a role
Let us annotate the protocol of Example 4.1.
The role
Finally, the user annotations provide, for each CryptoVerif type T, the corresponding OCaml type The function The serialization function The predicate function
The user annotations also provide, for each CryptoVerif function
CryptoVerif verifies the following properties.
There is a single occurrence of each role
This property guarantees that we know which process to compile for a given role, and which roles start after the return from a given oracle.
There are no nested roles.
Furthermore, for simplicity, we also assume the following points.
All oracle definitions are included in a role.
This assumption is relaxed in the implementation: we accept all processes in which all oracles in a role are not preceded by oracles not in a role. In practice, oracles outside a role serve in representing features, such as corruption of protocol participants or registration of dishonest participants, that are needed in the proof of the security property but not in the implementation of the protocol. For instance, in Example 4.2, the oracle
No replication occurs above a parallel composition or a replication. When the definition of a role
A process can be transformed so that no replication occurs above a parallel composition by distributing the replications into the parallel compositions:
By Properties 4.6, 4.5 and 4.11, there cannot be, in the same process, a definition of an oracle O directly under replication and another definition of the same oracle O not directly under replication. Hence, we can use the phrase “O is under replication” unambiguously. Moreover, by Property 4.5, the bound of the replication above a definition of an oracle O is the same for all definitions of O.
For each oracle O under replication, we let
After transforming the process so that it satisfies Assumption 4.14, we can transform it into a process that satisfies Assumption 4.15 by renaming the bounds of replications above distinct roles or oracles to distinct bounds. For instance,
Assumptions 4.14 and 4.15 are relaxed in our implementation: we warn the user when the process does not satisfy them, but we accept the process. Not heeding these warnings will lead to CryptoVerif returning imprecise, but sound, probabilities of security. We use these assumptions because they simplify the proof without losing much generality. Our result can be extended to the general case as follows: if the process
This section presents the OCaml language, the target language of our compiler, by giving its syntax and semantics. We omit some constructs, such as loops and type constructors, which are not used by our compiler. The subset that we consider is still Turing complete, so we do not lose expressivity by removing these constructs. To define the formal semantics, we adapted the small step operational semantics of the core part of OCaml by Scott Owens et al. [17,18].
Syntax and informal semantics
Figure 5 summarizes the syntax of our subset of OCaml. For brevity, we ignore types in this syntax.

OCaml syntax.
Pattern-matching is a central feature of OCaml. A pattern
The basic operations of the language are implemented by primitives
Most expressions are standard. Constants c can be integers, strings, boolean values
Closures are not present in the initial program, but they serve to represent functional values internally. The closure
A security protocol typically involves several programs running in parallel on different machines. We model this situation by considering several threads. To manage threads, we introduce two new expressions,
We define the list expression
A program is a list of top level definitions d, or the raising of an exception. We omit the final ε in a sequence of definitions when it is not empty.
Expressions reduce into values or exceptional values. As summarized in Fig. 6, the values v are functional values like closures, constants c, locations l, and tuples and lists of values. An exceptional value is

OCaml values.
We define step by step the semantics of the various constructs of the language.
Pattern matching
We define the predicate

Matches predicate.
The semantics of primitives is defined in Fig. 8. This semantics is defined by rules of the form

Rules for OCaml primitives.
The semantics of [17,18] substitutes variables with their values. Instead, we define an environment
For example, we evaluate the argument of applications first, and when it becomes a value, we evaluate the function, so

Rules for expressions.

Rules for expressions (continued).

Rules for programs.
Hence, we evaluate expressions and programs by reducing triples
Let us present as an example the reduction of a simple program in an empty environment and an empty stack:
The expressions

Store rules.
As usual, the contents of locations are stored in a
Toplevel reduction
As mentioned in Section 5.1, and in contrast to [17,18], we consider several threads running in parallel. Each thread has a configuration

Top level rules.
The reduction rules for semantic configurations
Second, the relation
Finally, the relation
Splitting the definition of the semantics into three relations allows us to lighten notations in proofs: we can use the reduction on a thread, or on a thread and the global store, without mentioning the other components when they are not affected.
The construct
First, it is sufficient to represent all program executions under the weaker assumption that two threads that read or write the same file are not run concurrently. Indeed, two oracles can interfere with each other only through files, and such interferences are forbidden by this assumption. Hence, by swapping execution steps, a trace that obeys this assumption with any interleaving of the oracle calls can be transformed into an equivalent trace in which the oracle calls are never interrupted, that is, a trace that can be scheduled in our semantics.
Second, it resembles the CryptoVerif semantics, which also has a single active thread and processes one oracle call after the other. This point facilitates the proof of our compiler.
OCaml programs typically contain several modules. We adopt a very simple model of modules. A module named μ consists of an OCaml program
Such a program is run by using the previous reduction rules from the initial configuration
Although we ignore types is our syntax, we suppose that our OCaml programs are well typed, which is checked by the OCaml compiler, and we use the guarantee that well-typed programs do not go wrong: a program stops only when the current thread has been reduced into the empty definition list or an exception
Equivalence modulo renaming of locations
The rule (Store alloc) is non-deterministic, since the new location l can be any unused location in
We will also use notations similar to Definition 4.8 for the OCaml semantics. We denote by
Instrumentation of the OCaml semantics
In order to prove the correctness of our compiler, we instrument OCaml code in three ways; this section details this instrumentation and proves that it does not alter the semantics of OCaml.

Semantics of tagged functions.
First, we add a new kind of functions and closures
Second, we need to be able to match CryptoVerif events, so we add to the semantic configuration an element

Updated toplevel rules for the instrumented semantics.
Third, the roles of a CryptoVerif process cannot be executed in any order: if a role is defined after the return from an oracle, it can be executed only after the previous oracle has returned. For instance, we can run a server only after generating its keys. We need to enforce this constraint also in the OCaml program. Each CryptoVerif role
We also add the expression
Let us now show that this instrumentation does not alter the semantics of OCaml: an instrumented program behaves exactly in the same way as that program with the instrumentation deleted, provided only allowed roles are executed, as assumed by Assumption (A3). This assumption is formalized as follows:
The instrumented
We first show that, when a program or expression is a value v or an exceptional value
We define the equivalence
We extend this equivalence to non-instrumented configurations
We first show that configurations equivalent by
If
We prove this lemma in Appendix B (see the Supplemental material).
Let us now define the function
The function every every and all
in this thread.
The function removing any pair of the form and transforming each pair of the form
Let
Finally, let us define
We do not need to replace elements of the global store, as they cannot contain closures:
The next proposition shows that, with Assumption 6.1, there is a weak bisimulation between the non-instrumented semantics and the instrumented semantics, that is, the reductions match in the two semantics, but the number of steps may differ. Indeed, the
If
If
In the rest of the paper, we use only the instrumented semantics. Furthermore, we denote instrumented configurations by
In this section, we describe how our compiler translates an annotated CryptoVerif process. It translates each CryptoVerif role
Let us recall that the function
Before defining the translation of an oracle, let us first introduce some notations. For each CryptoVerif variable x, we denote by
We say that an oracle or role definition occurs at the beginning of Q when it is found in Q just under replication or parallel composition, without recursively looking into oracle definitions. We define the function
We also define the function

Translation function

Translation of an oracle.
To translate an oracle, we translate the body of the oracle using the function
For the
In the
An oracle
Finally, we generate an OCaml module
Let us explain the translation of the role
This role is translated into the module
This program defines the function
The function
To call the translation of oracle
The generated modules
The locations in
The program
Let us consider the following toy OCaml program
Following the annotations of Example 4.10, this example uses two private global store locations,
Detailing the reductions of this configuration would take too much space, but we still give some information on the configuration obtained after evaluating
The code executed until configuration
This section presents the proof of correctness of our compiler. We give ourselves a CryptoVerif process
In Section 8.1, we state our assumptions on the cryptographic primitives, and show that the primitives behave correctly independently of the rest of the program. In Section 8.2, we prove that the OCaml translation of a CryptoVerif oracle behaves like the oracle. In Section 8.3, we define the CryptoVerif adversary that simulates the OCaml adversary
Correctness of cryptographic primitives
Let us first formalize the assumptions we make about the implementation of cryptographic primitives. Let
There are no
An OCaml semantic configuration in which the current thread does not use
Let
There exists a unique complete thread trace
This assumption means that there are no uncaught exceptions, no access to the store, and no
For each CryptoVerif type T, OCaml values of the corresponding type
This assumption formalizes that data passed to or received from generated code is immutable, as mentioned in Assumption (A6): such data does not contain locations.
To establish the correspondence between CryptoVerif values and OCaml values, we define a function
The next assumption states that the primitives have been correctly implemented, following Assumption (A2): the implementation of the cryptographic primitives in
For each CryptoVerif function f of type For each CryptoVerif type T such that the function For each CryptoVerif type T such that the function For each CryptoVerif type T such that the functions If v is a non-empty list, then for each
Item (1) states that the implementation
In contrast to the conference version [11], in this paper, we allow the cryptographic primitives to use the store for their internal computations (which often happens in practice); the store created by the primitives appears on the right-hand side of reductions in Assumption 8.4. However, we still assume that the cryptographic primitives are pure functions: their usage of the store should not have any visible side effect, so the primitives cannot communicate across calls or communicate data to the adversary or to the rest of the code using the store. This assumption is modeled in Assumption 8.4 by considering that the primitives are initially called in an empty store. Hence, they cannot access pre-existing locations (there are none), and since their return value does not contain locations, the store at the end of the call will be unreachable. We show in Proposition 8.5, that when the primitives are called with a non-empty initial store, the primitives still execute in the same way as with an empty initial store: the only difference is that the unmodified initial store is added to the current store. Therefore, the primitives still do not access the initial store and the part of the store created during the execution of the primitive becomes unreachable when the primitive returns.
In general, when primitives make probabilistic choices, they might return the same result in several traces with a different environment and store. To simplify notations, Assumption 8.4 states that this does not happen, so that we have the same environment and store in all final configurations that yield the same result. Our proof could easily be extended to the general case if desired.
The next proposition shows that the primitives always return correct results, when they are called inside an OCaml program, so possibly with a non-empty store and a non-empty stack. We prove it in Appendix C (see the Supplemental material). It is a consequence of Assumption 8.4.
Let us consider a thread
If
If
If
If
If
If
(Correct behavior of the primitives).
In all cases
Correctness of the translation of oracle bodies
In this section, we show the correctness of the translation of oracle bodies in our compiler: we show a correspondence between the semantics of the oracle body in CryptoVerif and the semantics of its translation into OCaml.
Let
Next, we define the OCaml value corresponding to a CryptoVerif table, and we use this definition to define the OCaml environment and global store corresponding to a CryptoVerif environment and to CryptoVerif tables.
(CryptoVerif table to OCaml list).
Let us consider a table
(Minimal environment and global store).
We define
The
First, we show a correspondence between a CryptoVerif term and its OCaml translation.
(Term reduction).
Let M be a CryptoVerif term of type T
In this lemma, we consider an OCaml thread that evaluates the translation
Let us now introduce some notations that allow us to designate the various parts of OCaml semantic configurations.
(Helper functions).
For an OCaml configuration
The notation
Next, we prove that the CryptoVerif oracle bodies P are correctly translated into OCaml as
(Inner reduction).
Let
The proof of this lemma can be found in Appendix D (see the Supplemental material). This lemma is proved by cases on the process P. We use Lemma 8.8 when we need to evaluate a term. The cases
Simulation of the OCaml adversary
In this section, we show how to simulate in CryptoVerif any OCaml program

The program
In more detail, from the OCaml program When the OCaml program calls an oracle, When the OCaml program chooses a random bit, When the OCaml program terminates,
The rest of this section is devoted to the formal definition of all elements used in Fig. 18.
We assume that the OCaml program
The number of calls to the closure with tag O, τ in a trace
The number of executions of role
The number of random number generations in a trace
We define
While
In Fig. 18, we use a
Let
The adversary is mainly encoded by the function
When we call an oracle or instantiate a role under replication, we must choose an unused replication index for this replication, and call the oracle or instantiate the role with that replication index. In this simulation, we will always choose the smallest replication index that has not been used yet, so that the used indices form an interval
More precisely, the set of the form or of the form
The set
The set of the form or of the form
The set
Next, we define functions that manipulate these sets of oracles and roles. We define the subtraction operation If If
We define similarly the subtraction on sets of roles
Let us define the function
Let us consider a process
By Property 4.5, an oracle with a certain name O always takes arguments of the same types and always returns values of the same types. So we can say that the oracle
Let us recall that we denote by
The first oracles of a role
We define

Semantics followed by the simulator.
The syntax of the language of the simulator is almost the same as the language we described in Section 5, with the addition of tagged functions introduced in Section 6. We add the functional values
We present the semantics followed by our simulator in Fig. 19. When we encounter a configuration containing a successful call to an oracle (by
The CryptoVerif function If If If Otherwise, we define
The function
When
The functions
Let us consider a simulator configuration
We define the CryptoVerif function If the returns in oracle O end the current role, then by Property 4.11, there is only one If the returns in oracle O do not end the current role, then let us define In all other cases (that is,
Finally, we define the CryptoVerif function
When the returns in oracle O end the current role, the function
(Random simulation).
We define the CryptoVerif function
Let us finally define the initial state of the simulator. Let
Let us introduce notations for subprocesses of Fig. 18, used in the next example and in Definition 8.32.
(Processes).
We use the following notations:
These notations are useful to represent the CryptoVerif configuration when CryptoVerif calls
Let us consider again the OCaml program
We execute the simulator of Fig. 18 with that value of
At this configuration, the function
In this section, we prove our main security theorem by relating the CryptoVerif and OCaml systems.
Similarly to the definition of
To that order, we first introduce an intermediate semantics for CryptoVerif that decomposes the evaluation of the function
Intermediate semantics
We introduce extended CryptoVerif configurations
Let us define the reduction relation ⇝ such that:
When encountering a configuration
In the next lemma and proposition, we relate traces using ⇝ to traces using →, to prove that all events have the same probability in these two semantics. These results are proved in Appendix E (see the Supplemental material).
Let
If
If
We denote by
We partition the set of complete traces using → and beginning at
In this section, we first define a relation between the intermediate semantics and the OCaml semantics. Then, we prove that this relation holds, which implies that
Since the definition of the relation is fairly complex, we proceed in several steps. We first define an invariant on the simulator configurations, which intuitively means that each oracle is in a single status (possibly available in the future, available for immediate calls, already called) and that oracles available in different threads are distinct. To formalize this invariant, we first define the sets of oracles represented by
(Concretization of
and
).
Let us define the sets of oracles
The definition of
Next, we define several sets of oracles and roles, which allow us to determine which oracles and roles are in which state (callable immediately, available later) in a simulator configuration.
(Oracle sets).
Let
Let
Let
Let
Let
Let
We let
The definition of
The sets
Finally, we can define the desired invariant on simulator configurations.
(Oracles have distinct status).
Let The sets The
To understand how all these oracle sets interact, let us present the flow of an oracle not under replication
Initially, if the oracle occurs at the beginning of the process, it is in
For an oracle occurring at the beginning of a role, when the role containing it is instantiated using
When the initialization function of the role is reduced into a closure, the oracle moves from
When the initialization function of the role is called, the oracle moves from
When the oracle itself is called, it is removed from
The case of an oracle under replication is fairly similar, using
Let us show that the oracles of the initial simulator configuration
Let us also show that the oracles of the simulator configuration
As a second step in our definition of the relation between the intermediate semantics and the OCaml semantics, we define an invariant of the intermediate semantics that shows how the sets
Let us define the sets of oracles If
In contrast to the sets we defined in Definition 8.20, the indices of oracles in
Let us consider the simulator configuration
As a third step, we relate OCaml and simulator threads. To define this relation, we start from a simulator thread. We first replace the simulator role initialization with the OCaml one using the function
The function
This function transforms every occurrence of the tagged closures corresponding to role initialization in the simulator, which are added by the
(Correct closure).
Assume that for an oracle for an oracle for an oracle for an oracle for an oracle
The function
In the case
The case
Oracles under replication cannot disappear from
The case
(Replace
).
Let
The function
(Token part of the store).
Let
The function
In the following definitions, we use the exponent
(Relation between simulator and OCaml threads).
Let There is no closure, no tagged function The following properties hold: There exist There exists an injective function The locations The locations For each tagged closure There is no tagged function
This definition relates the threads of the simulator and of OCaml. A thread can be in one of the following two states. If it satisfies Item (T1), the thread is a protocol thread that was not scheduled yet. The simulator and OCaml threads correspond by transforming the program
We use the notations
The only closure of the form
A similar verification can be done for
Finally, we can define our relation between the intermediate and the OCaml semantics.
Let For For For all locations The oracles of If If
The relation
Item (I3) is an invariant on the CryptoVerif side: it relates the available oracles in
Item (I6) relates the threads of the simulator and of the OCaml semantics, following Definition 8.30.
Items (I7)–(I10) relate the values of the global store in the simulator and in the OCaml semantics. The public part of the global store is the same on both sides (Item (I10)). The private part (files and tables) is empty in the simulator, since this part is handled by CryptoVerif itself (Item (I8)) and cannot be accessed by the adversary (Item (I7)). We require that the private part of the OCaml global store corresponds to the CryptoVerif configuration (Item (I9)).
Item (I11) relates the OCaml multiset of callable modules
Items (I14)–(I17) ensure that we never reach the limits on the number of simulator steps
We verify the relation
Properties (I1) and (I2) are obvious from the form of the configurations, with
The locations
We have
The next two lemmas show that the relation
There exists a trace
Lemma 8.35 shows that the relation
Let
Either there exist n configurations
Or for each trace
We prove these lemmas in Appendix F (see the Supplemental material). Let us present a proof sketch of Lemma 8.35.
Let us take an extended CryptoVerif configuration The current thread of The current thread of Let us first look at the cases in which the configuration If the current expression of If the current expression of Otherwise, the configuration If the current expression of If the current expression of If the current expression of The other cases are straightforward since the simulator mimics the OCaml semantics. They all preserve the relation
From Lemmas 8.34 and 8.35, we can prove the following proposition, by extending the traces using Lemma 8.35 until we get complete traces.
Let
Then there exist disjoint sets of complete OCaml traces
We prove this proposition in more detail in Appendix G (see the Supplemental material). As an immediate consequence of this proposition, we obtain the following proposition.
By combining Propositions 8.19 and 8.37, we obtain the following theorem.
(Security result).
In other words, the adversary
CryptoVerif bounds the probability that an adversary Q breaks the security property D, that is, it finds a probability p that depends on the adversary such that, for all CryptoVerif adversaries Q for
The probability bound p returned by CryptoVerif is a function that depends on many parameters, expressed on the CryptoVerif protocol specification. Let us relate these parameters to the OCaml implementation. These parameters are as follows:
The maximum number of times the various oracles and roles have been called, The size of the CryptoVerif types T. The corresponding OCaml type The execution time of the cryptographic primitives and of various CryptoVerif constructs. This time can be set to the execution time of the corresponding OCaml implementation. The execution time of the adversary. Our proof shows that the function
From the probability bound given by CryptoVerif, we can then obtain a bound on the probability of breaking the security properties in the generated OCaml implementation of the protocol.
For the protocol
As detailed in [10], CryptoVerif shows that our model of the SSH Transport Layer Protocol guarantees the authentication of the server to the client and the secrecy of the session keys. By Theorem 8.38, our generated implementation of this protocol satisfies the same properties, provided Assumptions (A1)–(A7) hold.
We have proved that our compiler preserves security. Therefore, by using CryptoVerif, we can prove the desired security properties on the protocol specification, and then by using our compiler, we get a runnable implementation of the protocol, which satisfies the same security properties as the specification. Making such a proof is also useful because it clarifies the assumptions needed to ensure that the implementation is secure (Assumptions (A1)–(A7) in our case). The proof technique presented in this paper, simulating any adversary by a CryptoVerif process, is also useful to show that any Turing machine can be encoded as a CryptoVerif adversary, which is important for the validity of the verification by CryptoVerif.
Our approach could obviously be used to generate implementations in languages other than OCaml. It should not be difficult to adapt our compiler to another language. The structure of the proof should also remain the same, but obviously the details will need to be adapted to the semantics of each programming language. In a target language such as C, closures that we use to represent oracles could be represented by records containing a function pointer. Since C does not guarantee memory safety, an additional analysis of the network code should be performed to make sure that it does not access private data of our generated code. To simplify the analysis, one may require that the generated code and the network code belong to a clean subset of C. One might also go all the way to the generation of certified machine code, by using a certified compiler, as in [4].
Extending the specification language of CryptoVerif, for instance with loops and mutable data structures, would be helpful to implement real, complex protocols. The main difficulty in this task does not lie in the generation of implementations, but in the extension of the prover CryptoVerif itself. Formalizing our manual proof using a proof assistant (e.g., Coq) would also be interesting future work. We believe that our detailed proof will be a good starting point for that. It would also be interesting to extend our approach to support side channel attacks, such as timing attacks and power consumption attacks. Protection against such attacks is important in practical protocols.
Supplemental material
Online supplement consisting of Appendices is available at: https://dx-doi-org.web.bisu.edu.cn/10.3233/JCS-150524
Footnotes
Acknowledgments
This work was partly done while the authors were at École Normale Supérieure, Paris. It was partly supported by the ANR project ProSe (decision ANR 2010-VERS-004). The authors thank the reviewers for their helpful comments on a previous version of this paper.
