Abstract
Conventional security policies for software applications are adequate for managing concerns on the level of access control. But standard abstraction mechanisms of mainstream programming languages are not sufficient to express how information is allowed to flow between resources once access to them has been obtained. In practice we believe that such control – information flow control – is needed to manage the end-to-end security properties of applications. In this paper we present Paragon, a Java-based language with first-class support for static checking of information flow control policies. Paragon policies are specified in a logic-based policy language. By virtue of their explicitly stateful nature, these policies appear to be more expressive and flexible than those used in previous languages with information-flow support. Our contribution is to present the design and implementation of Paragon, which smoothly integrates the policy language with Java’s object-oriented setting, and reaps the benefits of the marriage with a fully fledged programming language.
Introduction
The general goal of this work is to construct innovative design methods for the construction of secure systems that put security requirements at the heart of the construction process, namely security by design. To do this we must (i) understand how we can unambiguously formulate the policy aims for secure systems, and (ii) develop technology to integrate these goals into design mechanisms and technologies that enable an efficient construction or verification of systems with respect to those policies.
We address this challenge using a programming language-centric approach, presenting a full-fledged security-typed programming language that allows the programmer to specify how data may be used in the system. These security policies are then enforced by compile-time type checking, thus requiring little run-time overhead. Through this we can guarantee that well-typed programs are secure by construction.
But which security policies might we want for our data, and why do we need special support to express them? Certain security policies, for example access control, are relatively easy to express in many modern programming languages. This is because limiting access to resources is something that good programming language abstraction mechanisms are designed to handle. However, access control mechanisms are often a poor tool to express the intended end-to-end security properties that we wish to impose on our applications.
Consider a travel planner “app” which permits you to plan a bus journey, and even add your planned trip to your calendar.1
The example is based on a family of actual Android apps (e.g. de.hafas.android.vasttrafik).
Research on controlling these information flows has progressed over the last decades. In this paper we identify three generations of security properties and control mechanisms for information-flow:
Information-flow control. In the 70’s, Denning and Denning pioneered the idea of certifying programs for compliance with information flow policies based on military-style secrecy classifications [18,19]. They used program analysis to validate that information labelled with a given clearance level could never influence output at any levels lower in the hierarchy – so for example a certified program could never leak top-secret information over a channel labelled as public. The language FlowCAML [41], a variant of ML with information-flow types and type inference, represents the state-of-the-art in support for static Denning-style confidentiality policies.
Beyond mandatory information flow control. Although a rigid, static hierarchy of security levels may be appropriate in a military message-passing scenario, it became quickly apparent that such a strict and static information flow policy is too rigid for modern software requirements. In practice we need a finer-grained and more dynamic view of information flow. The concept of declassification – the act of deliberately releasing (or leaking) sensitive information – is an important example of such a requirement. Without a possibility to leak secrets, some systems would be of no practical use. For example an information purchase protocol reveals the secret information once a condition (such as “payment transferred”) has been fulfilled. Yet another example is a password checking program that inevitably leaks some information: even when a login attempt fails the attacker learns that the guess is not the password.
With this in mind, the Jif programming language [31,34] can be seen as the next milestone after the pure Denning-style approach. Jif is a subset of Java extended with information flow labels. As well as implementing an important distributed view of data ownership, the so-called Decentralised label model [32,33], Jif included the possibility of declassification, which provides a liberal information flow escape hatch for programs which would otherwise be rejected.
Paragon, a third-generation IF language. Declassification, in many shapes and forms, has been widely studied in the research community in recent years [40]. The large variety of declassification concepts is testament to the fact that there is simply no single right way to control the flow of information that goes against the grain. Moreover, it is not always natural to view information flow policies as consisting of “good flows plus exceptional cases” at all; in some situations there is no obvious base-line policy, and the flows which are deemed acceptable may depend on the state of the system at any given moment.
In earlier work [11] we introduced a new highly versatile policy language, Paralocks, based around the idea of Flow Locks. We demonstrated its ability to model a wide variety of policy paradigms, from classical Denning-style policies to Jif’s decentralised label model, as well as the capability to model stateful information flow policies. But the idea of using Paralocks as types in a statically-checked programming language was only demonstrated for a toy language. The question whether a Flow Locks-based policy language could feasibly scale to inclusion in a full-fledged programming language, to allow practical programming with information flow control, was left open.
The main contribution of this paper is to answer that question with an emphatic yes. We present the new programming language Paragon, which extends the Java programming language with information flow policy specifications based on an object-oriented generalisation of Paralocks. Not only does it turn out to be feasible, but the marriage of our stateful policy mechanism and Java’s encapsulation facilities yields a whole that is greater than the sum of its parts: it allows for the creation of complex policy mechanisms as libraries, giving even stronger control over flows and declassification than the policy language alone.
This article is a revised and extended version of a paper published in the proceedings for the 11th Asian Symposium on Programming Languages and Systems, APLAS 2013 [12]. Compared to the earlier version, we provide an updated, and much more thorough, overview of the features in the Paragon language. We further include more examples that use the features in practice, discuss fundamental philosophy of how to design Paragon policies, provide technical depth to our case studies, and make other changes and improvements throughout.
Synopsis. The remainder of the paper is structured as follows.
Section 2 presents the language and philosophy for policies in Paragon. The policy language is largely agnostic to the programming language with which it is used; hence the questions discussed here would be equally relevant, with only minor changes, to any programming language using a Paralocks-style policy language. Understanding policies is crucial for using Paragon properly, and in this section we discuss their intuition and syntax. We further discuss the process of designing proper policies for a given system, i.e., what considerations that need to be made when deciding what policies to use.
Section 3 gives a gentle introduction to the programming language Paragon through the presentation of a series of small example programs. Each such program highlights some aspects and features of Paragon, by presenting them in a natural context. The presentation of each feature is neither formal nor exhaustive; instead we include forward references to subsequent sections where the interested reader can learn about them in more depth.
Section 4 and Section 5 provide a more thorough and formal treatment of Paragon, aimed at expert readers. In Section 4 we comprehensively present and define the various features of Paragon, including a more formal treatment of its policy language. This section is aimed at readers who want a thorough understanding of all aspects of Paragon from the point of view of a language user. In contrast, Section 5 subseqently provides an overview of the implementation of Paragon, aimed at curious readers that want a look under the hood. This section includes a formal treatment of the most relevant parts of Paragon’s policy type checker.
Section 6 discusses our experience from two larger case studies, and gives further examples of proper Paragon policy design. Related work in Section 7 and conclusions in Section 8 round out the paper.
In this section we present the shape and meaning of policies in Paragon. We begin with a high-level example of a policy, to give an intuitive idea of the various parts involved. We then present the Paragon Policy Language – the specification language for policy annotations used by the type checker to determine what flows should be allowed. This is followed by a general discussion on patterns and principles for policy design.
High-level policy
Suppose we have a program which deals with two actors, a vendor and a customer. The program has access to the vendor’s secret data – a software activation key – which should not be permitted to flow to the customer unless the customer has paid for the software. To model the payment act we have a special boolean flag called a lock. Let us call this particular lock “Paid”.
The Paid lock, and locks in general, is used solely to specify when information may flow from storage locations to actors. The lock is a special variable in the sense that the only interaction between the program and the lock is via some instructions to open or close the lock; it can be seen as an abstraction of the program’s security-relevant state. In the case of our example program we would need to associate the opening of the Paid lock with the actual confirmation of payment in the code.
The idea is then that security policies are associated with information containers in a program. In the case of a variable holding a software key, the policy would then be expressed as:
the data may flow to the vendor, if the paid lock is open, the data may flow to the customer.
It is now up to the policy analysis to ensure that the data does not flow to the customer until the paid lock is open.
If at some later point the lock was closed again, perhaps because the customer’s access to the software key was only for a limited period of time, the data should no longer be accessible to the customer (though they would not be required to forget what they have already learned).
Some paragon declarations, including this policy, are given in Fig. 1. In the following section we will explain the specifics of the Paragon policy language, returning to this example along the way.

Policies for a Software Key (single customer, single key).
Note that what we have described here is an information flow policy specification mechanism – it allows the programmer to specify a flow policy in a program, and get guarantees that the program correctly conforms to the policy as stated. We make no attempts to address the issue of whether the policy itself is correctly stated, i.e., that the opening and closing of locks is done in the right places, and that data is labelled with the proper policies. As we shall see though, the combination of the policy language and Java’s encapsulation gives us further guarantees that help address these latter issues; we return to this in Section 3.
Policies are used to label information containers in order to specify information flow constraints on the data they contain. The subject of these policies are known as actors, and a policy consists of a set of clauses, each of which names an actor and specifies the conditions under which information may flow to that actor. The information flow conditions are special predicates called locks. In this section we describe each of these policy language components, and sketch how policies can be compared according to how liberal they are.2
The Paragon Policy Language is an object-based generalisation of Paralocks [11,50], but we will not describe Paralocks in isolation here.
Actors. The subjects of Paragon policies are the information-flow relevant entities, which we refer to as actors. An actor could represent a user, a resource, a system component, an information source or sink, etc.; any entity that is involved in some information-flow concern. Note that we deliberately do not use the term principal, used in many other systems [33,42] to denote information-flow relevant entities. We consider actors to be a more primitive notion than principals; a system using the notion of principals could use (one or more) actors to represent them, but actors can also be used more liberally to denote e.g. files or resources, entities that are typically owned by principals, but rarely modelled as principals in their own right. This discussion is related to the notion of information recipients, to which we return in Section 2.3. In the software key example of Fig. 1 the actors are the vendor and the customer.
In Paragon, actors are represented by objects. As a further example, the code fragment below creates regular instances of some
There are no special native Paragon classes to use as actors; any object created from any class can be used in this fashion, and
Policies. A policy is used to annotate information containers in the program (fields, local variables), and specifies to which actors the information in those containers is allowed to flow. The
A Paragon policy itself consists of a set of clauses, each specifying one particular actor, or one group of actors of a particular type: the head of the clause. In the software key example, the
Since every actor is an object, there is a globally most permissive policy, namely
In the electronic version of this article this and other Paragon-specific keywords and primitives are all rendered in the same colour.
A policy may have multiple clauses. The policy
Here data can flow to both actor
It is important to note here the difference in interpretation between information flowing “to an actor”, and information flowing to (the fields and methods of) the object which is used to represent the actor. The possible confusion arises since objects serve double-duty, both as ordinary Java objects, and as actor identifiers. The statement that some information may flow to some actor, say
Flow locks. A clause may have a body that constrains the states in which the information may flow to the actors specified in the head. These constraints come in the form of flow locks which represent the policy-relevant state of the system. In the example, the second clause of
This is an example of a simple lock. The power of Paralocks comes from the fact that a lock may have parameters representing actors. We illustrate this with a simple generalisation of the software key example. In this scenario there are potentially multiple customers. Each customer can access the software key if that specific customer has paid for it. Figure 2 contains policy definitions for this case.

Policies for a Software Key (multiple customers, single key).
Note that Fig. 2 (line 3) illustrates the more general form of flow lock which consists of a typed predicate – a lock family – and a list of actors forming the arguments to that predicate. In this example we model multiple customers by introducing a customer class. Instances of this class will be the representatives of actual customers. The policy
Let us consider a further example involving binary locks. The following code defines two families of locks, one for modelling the ownership of files, and another for the organisational hierarchy among users:
Individual locks in the family can be addressed by specifying the actor arguments of the correct type, e.g.
Apart from using concrete actors, policies can quantify over the arguments to lock families:
The policy
The policy lattice. As mentioned above, there exist both a least restrictive policy,
In the examples of both Figs 1 and 2, respectively, we have
since the policy of
But what if we combine information with policy
since the least upper bound of the policy of
The least upper bound operator has a lower-bound dual, the greatest lower bound or meet operator, ⊓, which will also be useful.
Interfacing with Paragon. To integrate the policy language into a host language, we need (i) a way to associate policies with data, and (ii) a way to interface with the lock state. In Paragon we allow policies to be attached to data sources and sinks, through the use of modifiers. We saw an example of this in the declaration of the fields of the
We discuss policy annotations and their consequences in more detail in Section 4.2.
To work with the lock state, we allow locks to be directly manipulated using the special statements
Before we dig into the raw mechanical aspects of policies; their syntax and semantics, as well as their effect on what flows are allowed (Section 4), we reflect on the process of designing policy annotations and placing them on various program elements. In general the question of how to develop and use policy annotations depends on what overall system policy should be enforced, and thus cannot be answered in the abstract. However, some basic principles and patterns are common to all policy design and specification.
Out-going channels. As noted earlier, the statement that “some information may flow to
The first step of policy design is thus typically to consider the outward-bound end-points of the system, and assign policies to them, indicating who can be expected to gain access to the information after it leaves the system through that end-point. In the following, we will refer to such actors as information recipients. Consider for example a call to
Depending on the specific scenario, this may or may not be a reasonable expectation. For instance, in a setting where the end users communicate remotely with the system, information printed to
The choice of how to represent the information recipients is a natural corollary to the question of what policies to put on end-points. Are there more than one sort of recipient to whom information may flow? If so, the different sorts of recipients should typically be modelled using different types. Does the system need to identify many different individual recipients of the same type (e.g. the
Integrity and confidentiality. An interesting aspect of the choice of representation for information recipients is the classic duality between confidentiality and integrity, exhibited by most systems that include both components [7,8,34,42]. In Paragon this duality does not manifest – instead, Paragon always asks the basic question: under what circumstances may certain information flow to some actor. Paragon is agnostic as to whether the restrictions on flows arise due to concerns for confidentiality or integrity. For an information recipient to whom flows should be restricted for both integrity and confidentiality reasons, a useful and powerful pattern in Paragon is to model that recipient as a pair of actors; one for each kind of restriction. An out-going channel for flows to recipient
There are several benefits to using this pattern. Firstly, by giving the different kinds of actors different object types, Paragon will know when joining policies to keep the different concerns from becoming mixed up. This potentially makes complex policies, formed when combining information from several sources, easier to read and understand. Secondly, this split also easily allows for also having (channels to) recipients that care about only one kind of concern but not the other. For example, the recipient
Input sources. Once the decisions have been made regarding recipients, the policies on information input sources to the system should be crafted with those different recipients in mind. If any part of some information from an input source should ever be allowed to influence information flowing to some recipient, the actor(s) representing that recipient must appear as the head(s) of some clause(s) in the policy assigned to that input source. If not, Paragon will never, under any circumstances, allow such a flow. No catch-all, Alexandrian declassification operator exists that would allow adding potential recipients at a later point.
For each actor representing some (aspect of a) recipient, the restrictions that limit the information’s flow to that actor are then added. This amounts to identifying the security-relevant events of the system; modelling these using appropriate lock families; and deciding how they interact with the actors in the system. There are many different patterns for expressing restrictions on flows through locks, e.g. declassification, access control, temporal restrictions, relationships, etc. We showcase these patterns in the examples in next section. At this point, more kinds of actors can be introduced; actors that are not active recipients of information and therefore never appear as heads of clauses. These extra actors typically represent e.g. specific information sources, resources, or schemes. We show a few examples of such actors in the later sections: the
Internal entities. What remains in the process of building Paragon code is to assign policies to internal entities, i.e. fields or methods that are neither input or output channels. The policies to assign typically follow from the use of those entities, and the flows to and from sinks and sources. Both read and write effects should be specified. This is usually less of a design task and more a question of calculation and specification.
Special care should be taken when assigning policies to channels that allow information to temporarily leave the system, such as writing to a file or a database. Those policies should be assigned to allow neither backdoor channels (e.g. information written to a file read by a recipient through means outside the system) nor laundering (information is read back into the system with a less restrictive policy than it had when it was written).
In the next section we now proceed to show a series of examples, showcasing both the various language features (discussed more formally in Section 4), and the design principles presented here.
Paragon by example
In this section we explore Paragon through a series of small examples, which serve a two-fold purpose. First, these examples allow us to put the elements of Paragon introduced in the previous section into context. Second, it lets us demonstrate the generality of Paragon as an implementation language for a large variety of different policy mechanisms, and how, by the use of encapsulation, we can present each mechanism through a consistent interface.
Simple declassification
Our first example is the classic information flow idiom of a two-level confidentiality lattice with a simple declassification mechanism, showing succinctly how class encapsulation gives us the possibility to encode a policy scheme as a library. The interface of this scheme consists of three elements: policies for data that is secret (“high”) and public4
Not to be confused with Java’s notion of “public”, i.e. exported from a class.

Simple declassification.
We define the policy
High data may be made visible to low observers through declassification. We represent this with a condition (lock)
The policy
There are several interesting things to note about this method declaration. First, it shows the use of policy annotations as modifiers on variables and methods. Here we see that the formal parameter
The body of the method consists of a single statement: a scoped
Finally, we note that this method is now the only way to declassify data from
The library exposes only the basic building blocks to the application programmer: the primitive policies
For an example use of the
To further demonstrate the modularity, we could easily extend our library with the notion of data that may never be declassified:
Data annotated with policy
Our next example is a small application, which has been used as an example in several earlier papers [2,10,11]: a server for running online sealed-bid auctions. We want to model the following information flow properties:
bidders provide sealed bids and can see their own bid, but cannot see each others’ bids;
bidders may learn something about the winning bid (and hence implicitly something about all bids), but only at the end of the auction.
The class

Sealed-bid auctions –

Sealed-bid auctions –
We note that the policy here, unlike those used in the previous simple examples, is not marked as
Each
Note that the out-going
When the bidder supplies a bid as requested, by a call to the method
Two things are worth noting here. The first is the modifier
Running the auction now consists of four phases: Getting the bids from all the bidders, determining the winner, reporting the results, and handing out the spoils. The implementation of these actions in the class
We first declare a policy
The first phase, method
In the next phase, method
The local variable
Also noteworthy is that the assignment to
For explicit flows – such as assignments – the current lock state is taken into account; for implicit flows – such as conditionals – it is not. It is for that reason that we need the
Next we want to notify the bidders about the winning bid, calling the method
Leaving the implementation of
Next we present the scenario of a simple social network, illustrating the three generations of information-flow control policies described in the introduction. In the network, users can befriend each other and share messages in the form of posts that can be read by their friends. The scenario contains two information flow policies that we want Paragon to enforce.

A simple social network application written in Paragon.
First, posts can only be read by a direct friend of the poster or, if the poster so indicates, by friends of friends of the poster. A user can decide, per post, whether it should be shared with friends-of-friends or not. Paragon should thus enforce that the network properly checks the friendship relations before allowing a user to read a post.
Second, to prevent injection or scripting attacks, a message should be properly sanitised before it is stored in the network. That is, we want to enforce the policy that all posted messages first pass through a sanitising function.
The Paragon implementation of this network is shown in Fig. 6. Some policy annotations are omitted in the implementation, since Paragon provides default policies in these cases. For example, all fields that do not specify a read effect automatically get the least restrictive policy
The relevant information recipient actors in this scenario are the users, modelled by reusing the
With the lock families in place we can now create the desired policy as
As an effect of calling this method the array
The user’s
Leveraging Java’s encapsulation mechanism we are able to provide the ingredients for the sanitisation policy entirely as a separate library. Following the pattern shown in the declassification example (Section 3.1), the lock
The example demonstrates the three different generations of information-flow control policies and how Paragon models them:
As per traditional non-interference, some flows are never allowed in the network. For example, Paragon enforces that a posted message can only flow to users in the network, and not to any other channel. We see an example of the exceptional information declassification pattern in the sanitiser library: the
Our last example is an encoding of an information flow idiom in which permitted flows are specified using lexically scoped declarations, reminiscent of the work by Boudol and Almeida Matos [2]. The basic idea is a language mechanism that allows flows between security levels inside a lexically enclosed scope:
In this example, within the scope of the enclosed block information owned by (or that could flow to) security level

Encoding and encapsulating Lexically Scoped Flows in Paragon.
Encoding. To encode the scheme we first introduce the binary lock family
With this lock family we can easily encode the flow mechanism using the scoped
Lastly we need to encode the policy annotations used with this scheme. Given data of some security level, this data should be allowed to flow to other levels as well when the proper flows are enabled. Thus data with security level
The three components we have defined here – the
Encapsulation. The second step is the encapsulation, making the lock family read-only and exporting only two things: a way to construct consistent policies, and a method representing the flow construct.
To encapsulate the construction of policies we export the
The flow construct should take the code to run as an argument, but our host language Java does not support first class statements or procedures. The standard Java approach is to wrap the code to pass as a method of an anonymous object implementing a declared interface, and then pass the whole object as argument to the flow method.5
Java has introduced lambdas which are typically used for situations like these; however, lambdas do not play well together with type arguments so we cannot make use of them here.
The levels
To allow arbitrary side effects in the code block the interface receives an additional policy type argument
Finally, we wish to nest multiple
With these definitions of

Using Lexically Scoped Flows in Paragon.
Having covered many of the language features through small examples, we now present more detail of the main features of Paragon. Paragon is largely an extension to the Java language and type system. Our choice for Java is motivated by its relatively clear semantics and the wide adoption of Java in both commercial and academical settings. In addition, it allows us to reuse existing ideas from, and simultaneously compare Paragon with, Jif [31,34], the only (other) Java-based full-fledged security-typed programming language to date. We discuss Paragon’s relation to Jif in more detail in Section 7. We do not, however, rely on any particular features of Java for the integration of our policy language to work, and posit that it would be equally feasible to do this for other statically typed languages with safe memory management, e.g. ML or Scala.
In this section we give a systematic review of the significant features that Paragon adds to the Java language. While the section can be read as a coherent narrative, it is also structured as a reference to the various features, each presented in a separate subsection. For a more gentle, less technical, introduction to Paragon, we refer the reader to the previous section, as well as our Paragon tutorial [51]. Here we cover the following sections in turn:
information flows and policies, discussing how policies determine which flows are allowed (Section 4.1); policy annotations, discussing how policies are associated with data (Section 4.2); policy inference and defaults, discussing different ways in which Paragon aims to reduce the need for explicit annotations (Section 4.3); policy expressions, introducing policy operators, type methods and runtime policies (Section 4.4); lock state analysis, discussing the exceptions and indirect control flow (Section 4.6); and finally type parameters (Section 4.7).
Information flows and policies
The various flows of information that need to be controlled in Paragon are essentially the same as the ones occurring in Java. As is common in information-flow analysis we make a distinction between direct and indirect information flows.
Direct flows. The typical direct flow is an assignment, where information flows directly from one location to another. Direct flows also happen at method calls (arguments flow to parameters), returns (the returned value flows back to the caller) and exception throws (an exception value flows to its enclosing catch clause).
As an example, let
Indirect flows. An indirect flow is one where the effect of evaluating one term reveals information about a completely different term that was evaluated previously. The typical indirect flow is a side-effect happening in a branch that reveals which branch was chosen, which in turn reveals the value of the conditional expression that was branched on. Indirect flows also arise from other control flow constructions (including loops and exceptions), and field updates or instance method-calls (possibly revealing the object they belonged to).
Due to the delayed nature of these information flows, the lock state in effect at the time of the indirect flow might be different to that in effect at the point at which it is revealed. Therefore, indirect flows are handled conservatively, by not allowing the lock state to affect which of these flows are considered secure.
Policy annotations
When integrating the policy language into Java, the two core design issues are (i) how policies are to be associated to data, and (ii) how the lock state is specified, updated, and queried.
In Paragon every information container (field, variable, parameter, lock) has a policy detailing how the information contained therein may be used. Every outgoing (or incoming) channel similarly has a policy specifying what it means to have information leave (or enter) the system through that channel. Further, every expression has an effective policy which is (an upper bound on) the conjunction of all policies on all containers whose contents affect its resulting value – we refer to this as the expression’s read effect.
Paragon separates policies from base (Java) types syntactically by having all policy annotations as modifiers. Examples of ordinary Java modifiers are
Similarly, every expression (and statement) has a write effect, which is (a lower bound on) the disjunction of all policies on all containers whose contents are modified by the expression. Write effects allow us to control implicit information flows, by limiting the contexts in which expressions with side-effects may occur. A modifier
Policy inference and defaults
To reduce the burden on the programmer to put in policy annotations, Paragon attempts to either infer, or supply useful defaults for, policies on variables, fields and functions. The policy defaulting and inference mechanisms, the latter based on work by Rehof and Mogensen [38], are very heavily influenced by those used in Jif [34]. We describe them in this section for the sake of completeness.
Policy defaults. For fields, omitting the policy annotation is interpreted as having no information flow concerns for that field. Hence, the default policy for fields is simply the most liberal policy,
Methods have policies both on their parameters and on their return value (if any). Omitting the policy annotation on a parameter makes the method parametric in the policy of the corresponding argument. Paragon supplies a special construct
For the return policy, the default assumption is that the returned value depends on all the arguments given to the method, so the default return policy is the conjunction (join) of the policies on all parameters.
As a typical example of a policy-parametric method, consider
The default write effect for a method is
Policy inference. Local variables serve a common role as temporary information containers. Omitting the policy annotation on such a variable is interpreted as the programmer not caring what policy it has, so long as that policy does not violate any information flow concerns specified elsewhere. Paragon therefore tries to infer policies for local variables such that all information flows that involve them are deemed safe by the type checker. We discuss the technical details of policy inference in Section 5.
Policy expressions
So far we have seen many examples of literal policies, that is, policies constructed directly using clauses of actors and locks. Policies can also be constructed programmatically in various ways. They can be used as values at runtime, and can be dynamically hoisted to the type level to be used in modifiers. To ensure consistency across e.g. several calls to the same method, only final fields or variables may be used in the construction of policy modifiers. Again this decision is directly influenced by the same behaviour in Jif.
Policy operators. Policies can be created by combining other policies using the overloaded operators for join (
Type methods. To allow for the creation of complex policies in libraries, we must allow policies to be generated by methods, which forces our policy analysis to also analyse the behaviour of method calls. To avoid a situation where the precision of the analysis appears unintuitive and ad-hoc to the programmer, we allow programmers to explicitly mark which methods the analysis should take into account with the modifier
A more formally correct name would perhaps be type functions, since these methods must be both pure, i.e. have no side-effects, and deterministic. By deterministic we mean that the end result may only depend on values known statically when the method is called. That includes the method’s arguments, as well as certain static fields; specifically, for a field to be usable in a type method, it must be static and final, have a primitive type, have the least restrictive policy (i.e. bottom) applied to it, and have a simple initialiser that is itself pure and deterministic. These constraints are checked by our compiler for methods marked as type methods.
Runtime policies. Some policies for information simply cannot be known statically during type checking. They may be read from files or database entries, or constructed to model the UNIX privileges for a file, or based on other external and dynamic conditions.
To handle such runtime policies, the analysis must approximate upper and lower bounds for all policies based on what information is available for them. To improve precision, we need ways to relate policies that are not known statically to other (static or runtime) policies, to improve precision. We let our policy analysis be guided by inequality constraints between policies appearing as the condition in
Here we assume that
This problem has been studied by Zheng and Myers in the context of Jif [57], and the solution presented here closely follows theirs.
Lock state analysis
Manipulation of the lock state is done programmatically through the use of the Paragon-specific statements
Paragon also provides a lexically scoped version of the
Runtime lock queries. Locks in Paragon are not first class. They cannot be stored in variables, nor can they be passed as arguments to methods. The only way to manipulate the status of a lock is via
Lock state modifiers. To facilitate modularity, Paragon introduces three modifiers, used on methods and constructors, to specify their interaction with the lock state:
The opens and closes modifiers are also used to annotate each exception type thrown by a method, to signal to the analysis what changes to the lock state can be assumed if the method terminates by throwing an exception of that type.
Lock properties. A lock can be declared to have properties. A property specifies conditions under which some locks are implicitly open. For example, we might want to express that the acts-for relation defined in Section 2.2 is transitive and reflexive. This requirement can be stated at the point of declaration as follows:
These clauses look similar to those of policies, however the heads here represent locks that are implicitly open, while in policies they represent information recipients. We follow the Java tradition of forcing variables to be declared before they are mentioned, even though the types could easily be inferred in this example. This is not always the case though; specifically, we could have rules that declare some locks to be implicitly opened for actors that are members of a specific subclass.
Transitivity and reflexivity properties (as well as symmetry) are a common pattern, so Paragon provides syntactic sugar for these:
Since lock properties must be attached to the declaration of a lock, they are a modest restriction of the Recursive Paralocks, discussed in [11]. However it turns out that this restriction make the policy operations decidable, and the policy comparison operation specified in [11] becomes complete (as opposed to just being sound). Full details are given in [50].
Actors and aliasing. Together with locks, actors, represented by object references, play a crucial role in the typing of Paragon code. Locks determine what flows are allowed at what points, and locks are often parameterised by actors. The typeability of some code may depend on a given lock, with some given actor arguments, being open. Formally, the type checker treats actors as singleton types [4].
However, the possibility of aliases complicates things. If some code opens lock
We discuss the technical aspects of alias analysis in more detail in Section 5. For the purpose of language presentation we note that, like with runtime lock queries guiding lock state analysis, or runtime inequality constraints on policies help increase precision, we let conditions involving tests on reference (in-)equality help make our alias analysis more precise.
Modifiers on lock families. Since locks may be queried at runtime, they are information containers, and, like other containers of information, need policies to dictate how that information may flow. The policy on a lock family becomes the read effect of a lock query expression, as well as the write effect of any
To avoid further issues with aliasing of locks, all locks are static; there is only ever one version of a lock family, not one per instance of the class in which it is declared. This restriction is without loss of generality, since the behaviour of having one lock family per instance can be achieved by giving the lock family one more parameter of the appropriate type, and use the instances as arguments. Thus, instead of
A lock may be exported as
Exceptions and indirect control flow
The static policy type system in Paragon tracks two kinds of information flows: direct flows arising from assignments, and indirect flows arising from control flow. It makes no attempt to track flows arising from termination – it is termination insensitive. If exceptions could not be caught, an exception would be the same as (premature) termination, which means we would not need to care about them. However, the catch mechanism makes exceptions rather a kind of control flow primitive, needing special attention.
All exceptions in Paragon must be checked, i.e. declared to be thrown by methods that may terminate with such exceptions. This implies the need for analyses that can rule out the possibility of exceptions, in particular for null pointers, to avoid a massive blow-up in the number of potential exceptions that must be declared. Paragon adds the modifier
A caught exception is in essence a jump, where control is passed from the throw point to a catch block. Such a jump may be noticeable by anyone who can notice the catch block being executed, or the statements in the normal control flow past the throw point. To avoid unintended flows, all such statements are constrained by the context in which the throw appears. We refer to this as the exception’s area of influence.
Since an exception might not be caught locally, the area of influence is not a local property in general. To handle this modularly we let methods that throw exceptions declare the write effects of those exceptions as modifiers on the exception types in the method’s
Since uncaught exceptions are effectively premature exit points from a method, the opens and closes modifiers pertaining to a normal exit do not apply when entering a catch block. Hence we let the declared exceptions also take opens and closes modifiers, specifying the lock state that will be in effect at the start of a corresponding catch block.
For the cases where a thrown exception is caught locally, before ever reaching the top level of a function, there will be no need for approximations via declared policy or lockstate modifiers. Instead, all the necessary information can be computed locally.
Interestingly, several other control flow mechanisms in Java can be treated as special cases of exceptions for purposes of policy inference:
Type parameters
Java, since the introduction of “Generics” in Java 5.0, allows types and methods to be parameterised by types, giving Java parametric polymorphism. Paragon introduces several new entities – actors, policies and locks – that affect typing in various ways. It is natural to extend the polymorphism to also include these aspects. To ensure that the different entities are handled correctly, we need a kind system to signal which kind of entity we expect for a given parameter.
A typical use of a policy type parameter can be seen for a collection, e.g.
In Paragon, those same elements have both a type
Arrays are a special case of collections, and the same pattern is repeated with elements having a common policy. The syntax differs slightly; we write
In Paragon, ordinary reference types have the implicit kind type. Type parameters of kind type need not be annotated, like in vanilla Java. For the Paragon-specific entities we introduce kind annotations, as seen in the example above, to separate them from each other and from ordinary types. Since actors are modelled by objects, we use an actor’s reference type as its kind. For policies we can simply reuse their type as kinds as well.
We can do the same for locks though we typically want to parameterise over not just single locks, but rather sets of locks. To avoid introducing new keywords, we reuse the syntax for arrays for this purpose, i.e. the kind annotation on parameters taking sets of locks is
With Generics, the Java type checker tries to infer type arguments where none are provided. Paragon does the same for missing type arguments, with a slight generalisation – we allow partial type argument lists, which are assumed to instantiate the type parameters of the method or constructor from left to right.
Enforcement of Paragon policies
Enforcement of information flow policies in Paragon is no small task, and presenting the information flow type system in its entirety is beyond the scope of this paper. Instead, we sketch a high-level overview of the most important analyses involved, presented as a sequence of phases. Each phase serves a purpose as an abstract interface; we specify what information we expect the phase to provide, but the concrete choice of algorithms is orthogonal to the other phases. We then present our formal type system for the two most interesting phases from a pure Paragon perspective: Lock state evaluation and Policy constraint generation.
Overview of phases
Phase 1: Type checking. The first phase roughly corresponds to ordinary Java type checking, albeit with some additions for Paragon-specific constructs. Particularly, we must assure that arguments to locks are type correct, and that policy expressions used in modifiers are indeed of type
The important thing to note here is that the checking of ordinary Java type correctness is largely unaffected by the addition of information flow policies, and can be performed prior to the analysis of information flow.
Phase 2: Policy type evaluation. Locks, policies, and objects all play a dual role, both as type-level and value-level entities. In this phase the values of each of these entities are statically approximated. For locks we ensure that, whenever a lock is queried, the information in the query is propagated to the respective branch (or loop body).
For fields and variables holding actors, i.e. object references, approximating their runtime values means performing an alias analysis. Our present analysis is simple but has performed well enough in practice. However, work is in progress to improve its precision by adapting the work by Whaley and Rinard [54].
Since policies can be used as values at runtime, and dynamically hoisted to the type level, our analysis needs to approximate policies as singleton types, similar to the analysis of actors. For each field or variable storing a policy, and for each policy expression appearing in a modifier, we thus calculate upper and lower bounds on the policy held by that variable at runtime.
Further, we need ways to relate policies that are not known statically to other (static or dynamic) policies, to improve precision. Similar to runtime lock queries, we thus let our policy analysis be guided by inequality constraints between policies appearing as the condition in
Phase 3: Lock state evaluation. The next phase approximates the lock state, i.e. a static (under-)approximation of the set of locks that will always be open at each program point. This amounts to a dataflow analysis over the control flow graph, to properly capture the influence of method calls and exceptions, and to handle loops. Each program point where a direct flow takes place is annotated with the lock state in effect at that point. Direct flows are limited to the following four language constructs: assignments, method calls (and instance creation), returning values, and throwing exceptions.
The details of approximating the lock state are shown in the next section.
Phase 4: Policy constraint generation. The constraint generation phase will result in a set of constraints on the form
The full details of constraint generation are shown in the next section.
Phase 5: Policy constraint solving. The last phase solves the generated constraints, on a per-method basis. A solution to a set of constraints is an assignment of policies to constraint variables that satisfies all the policy comparison constraints. The algorithm needs only determine whether there exists a solution, and does not need to actually produce one. The constraint solver is based on the algorithm presented by Rehof and Mogensen [38].
Paragon type system
In this section we present the formal type system for the parts of the analysis corresponding to lock state approximation and policy constraint generation, for a sizeable subset of Java. Our implementation covers a larger subset still, but here we leave out a number of features that do not add anything to the presentation. More specifically, the features left out are enums, static fields, arrays (as in the syntactic sugar), inner classes, casts, most operators, labeled statements, as well as expressions and statements whose typing would be very similar to those already covered (e.g.
Lock state approximation. Let us first look at lock states, and define some convenient operations. First, in our analysis a lock state is represented concretely as the set of locks being open in that state. We will also have use for the concept of an unreachable state, represented with the distinguished value
Next, we will have use for the concept of a lock modification, representing a set of changes to be applied to a lock state. We model this concretely as a pair
We now need two (overloaded) operations, ⊳ and ◇, respectively corresponding to sequential and parallel composition of lock states and lock modifications. We have that:
Now we can turn to the analysis itself. The behaviour of the analysis is captured by the following declarative judgment, covering both expressions and statements:
X maps exception types (and the pseudo-exceptions
We also assume a global environment M, which is a mapping from reference types and method names to a four-tuple, where the first three components are the locks that method respectively expects, opens and closes, and the final component,
Further, a rule for this judgment is valid only if each occurence of the mentioned language constructs is annotated with the correct lock state.
We will consistently use superscripts (sometimes in conjunction with square brackets for scoping) to denote annotated information, while subscripts or primes serve to disambiguate terms. Technically, an expression e will also be annotated with information from previous passes. We will omit such annotations when they are not needed by the rule.
Looking at expressions first, we omit the rules for literals, the expressions
This rule deals with variable assignment. The rule for field assignments is analogous; for primary field assignments (where the object is calculated from an expression) or array updates, the only difference is that the sub-expressions of those left-hand sides are analysed first.
The rule for method calls is more involved:
where
Finally (for expressions), the rule for the conditional operator:
The interesting thing to note here is that we expect the condition to be annotated, from a previous phase, with any locks known to be open or closed if the condition is true and false, respectively, and we take this into account when analysing and annotating the expressions in the branches. The annotation
Turning to statements, empty statements and expression statements add nothing interesting, and
Most central to this phase are the
The
The rule for
where
For thrown exceptions we have the following rule:
Three things to note: first, the lock state after this point is
Finally we have the rules for
where
The rule for
where
Policy constraint generation. This phase is captured by the following judgment, where we assume that the expression e has been properly annotated from the lock state approximation phase:
p is the effective policy of the expression (i.e., the policy of the value of the expression); θ is the set of generated constraints, of the form
The judgment for statements is similar, except that they do not return a value, and hence do not have a policy.
We implicitly assume an environment E containing policy signatures for fields, variables, methods and locks, as well as the declared return policy of the enclosing method. We also assume a set of global lock properties G, which would formally be added as a subscript to every constraint, e.g.
The rules for literals,
The only interesting thing to note is that the policy is the join of the policies of the containing object and the field. The rule for binary operators is very similar:
In the rule for conditionals we extend the branch PC to constrain indirect flows in the branches:
An assignment constitutes an actual information flow, so the rule for assignments is where many of the constraints arise:
where
Finally, the rule for method calls:
Here we assume
The constraints in θ here are similar in spirit to the ones for assignment, only here the write effect of the method,
The outgoing exception PC map is updated to take into account all exceptions potentially thrown by the method.
For statements, the rules for empty statements and expression statements are uninteresting, and the rule for
The rule for
where
Note in particular that the entire loop, including the condition itself, is constrained by the policy of the condition.
The rules for
where
The rules for the pseudo-exceptions are even simpler, so we omit them.
For
Note first that we assume the parameter of the catch block to be annotated with its calculated policy
where
Finally we look at the Paragon-specific statements
The rule for closing is identical. Note the similarity to the rule for assignments.
This concludes our presentation of the formal type system for the two most interesting phases, from the perspective of Paragon policy checking.
Paragon implementation
We have implemented Paragon in a compiler that performs type checking for policies, and compiles policy-compliant programs into vanilla Java code. While the enforcement system was presented in a modular fashion above, for efficiency the current implementation performs most of the phases simultaneously.
Once we know that a given program satisfies the intended information flow properties, we can safely remove all Paragon-specific type-level aspects of policies, locks and actors. We must still retain the runtime aspects, such as querying the lock state and performing inequality comparisons between policies.
The Paragon runtime library provides Java implementations for locks and properties, including operations for opening, closing and querying locks to which the Paragon
Compiler statistics. Our Paragon compiler is written in Haskell and comprises roughly 16k lines of code, including comments. Approximately half of that code is due to our policy type checker, and only a small fraction, just over 600 lines of code, deals with generation of Java code and the Paragon interface files needed for modular compilation. Of the roughly 8k lines of code for the type checker, slightly less than 1k are used for the policy language, including constraint solving. Most of the complexity for the remainder comes simply from the fact that Java is a large language, with many syntactic constructs. Note that, as per phase 1, our policy type checker includes a (superset of a) full type checker for ordinary Java types.
On top of the compiler itself, some 1500 lines of Java code are written for our runtime representations of Paragon entities.
Runtime overhead. Supporting lock queries and policy comparisons at runtime yields a negligible overhead on Paragon programs. Most of the additional generated code handles the initialisation of policies and locks upon class or object instantiation, as well as the opening and closing of locks, which does not give any significant performance penalty. More involved are the lock queries and policy comparisons themselves since they resemble essentially Datalog program evaluation and respectively containment [50]. However, our experience shows that clause bodies consist of just a few atoms, and have yet to find an example involving locks with arity higher than two, so in practice we posit that this overhead is negligible as well.
Case studies
We put the language and our implementation to the test with two case studies, both based on applications written in the Jif programming language, to which we further relate in Section 7. We note that what we test here is not performance metrics of the compiler (beyond simply noting that, for the cases at hand, the compile time overhead is negligible). Nor do we aim to stress test the precision of any of the various algorithms underpinning the implementation. What we look at here is the expressiveness in terms of policy paradigms, and whether our analysis is precise enough to check programs that use those policies. Scalability is measured in a syntactic sense: whether having more complex and varied policies, and more data items associated with such policies, will lead to an explosion of policy annotations in the source code, or remain at manageable levels.
Mental poker. In [3], a non-trivial cryptographic protocol for playing online poker without a trusted third party is implemented in Jif. The protocol includes two mutually distrusting players, each in the possession of a public-private key pair and the public key of the other player.
During the distribution of the cards, players communicate cards encrypted with a per-player, per-game symmetric key. That is, the receiving player cannot decrypt the received card since the symmetric key under which it is encrypted was generated by the sending player. At the end of the game the players reveal their symmetric key such that the other player may verify the outcome of the card distribution. For the purpose of non-repudiation each player signs outgoing messages with her private key.
From an information-flow perspective we desire an implementation of this protocol to satisfy various policies. For each policy we present a simplified extract of our implementation (6.5k lines, available from [36]).
The cards to be communicated should not be sent before they are encrypted with the symmetric key and then signed.
We first consider the signing step. This is implemented following the declassification pattern described in Section 3.1. Using a private lock, we are certain that data labelled as
Next, we use the same pattern to ensure that unencrypted data has to pass through the
By annotating the cards with
The public key of a player is visible to everyone, as it is used to verify the player’s signatures, but the private key should never leave the player’s client.
Information about the private key necessarily leaks from the signing operation. Since we trust the implementation of
The symmetric key should remain confidential to the player until the end of the game.
Like the private key, the symmetric key is partly revealed when it is used in the
Encoding these policy requirements in the Paragon Policy Language appears straightforward and natural. By contrast, the original case study written in Jif uses owner-based policies specified in the Decentralized Label Model (DLM) [33]. The Jif policies can simply state whether the cards are owned by a given player or not, and cannot, in an obvious way, express anything beyond that.
In Paragon we can use encapsulation of locks to create multiple different declassification-like interfaces. Although Jif programs have access to exactly one declassification mechanism, different forms of declassification (such as encryption versus input sanitisation) could be encoded via the use of dynamic principals. By introducing a dedicated principal for each declassification and then encapsulating these principals, the power of declassifiation is limited to this encapsulating code. This pattern was however not used in the Jif version of this case study.
Jif does not provide a means to specify temporal constraints and thus cannot express a policy similar to
JPMail. The second case study implements a functional e-mail client based on JPMail [25]. In JPMail the user can specify a mail-policy file, partly dictating the information-flow policies that the mail client has to enforce. The mail-policy file links JPMail user identities with operating system (UNIX) identities, and specifies which encryption algorithms are trusted by each JPMail user. Here we describe only the encryption-preferences related elements, presented in the style of this paper. For the full implementation (2.6k lines) we refer to [36].
JPMail ensures that an e-mail is only sent if its body has been encrypted under an encryption algorithm that is trusted by the receiver of that e-mail. In addition JPMail enforces more static policies, e.g. preventing the login credentials from flowing anywhere else than to the e-mail server. Such static policies can be easily modelled by creating a dedicated actor representing the server:
The e-mail body can then be annotated with a policy that only allows this information to be sent to the server if it is encrypted. In addition, the encryption algorithm used needs to be trusted by the receiver of that particular e-mail:
The appropriate
The
Specifying an
Due to the genericity of the
The issues for the Jif implementation in the mental poker case study show up in the JPMail example [25] as well. Moreover, stateful policies (here modelled by the
Related work
In this section we consider the related work on languages and language support for expressive information flow policies. We focus on actual systems rather than theoretical studies on policy mechanisms and formalisms. We note, however, that there are several policy languages in the access control and authorisation area which have some superficial similarity with the Paragon Policy Language, since they are based on datalog-like clauses to express properties like delegation and roles, see e.g. [5,20,27,28]. Key differences are (i) the information flow semantics that lies at the heart of Paragon, and (ii) the fact that the principal operation in Paragon is comparison and combination of policies, whereas in the aforementioned works the only operation of interest is querying of rules.
Languages with explicit type-based information-flow tracking. Two “real-sized” languages stand out as providing information-flow primitives as types, namely FlowCaml and Jif – as discussed in the Introduction.
FlowCaml is a subset of OCaml extended with information flow annotations on types. Although FlowCaml only supports simple lattice-based security policies with little flexibility, it is notable that full ML-style type inference is supported, and a meta-theory which covers both this and information-flow soundness [37].
Due to the unique position Jif has enjoyed in the domain of information flow research over many years, much research has been done using Jif for context and examples. In many aspects, our work on Paragon has greatly benefited from Jif’s trailblazing, as well as research done in the context of Jif. Policy defaulting mechanisms, handling of runtime policies, and having all exceptions checked, are all features where we have been able to adopt Jif’s solution directly.
The main advantage of Paragon over Jif is undoubtedly the flexibility of the concept of locks, including their stateful nature. Where Jif has a single declassify construct, Paragon can provide different declassifying methods to work on different data, as needed by the domain at hand, and relate that declassification to the state of the program. Jif rigidly builds in some stateful aspects in the form of authority and delegations, which in Paragon would just be special cases of working with locks.
Jif has been used as the core of a framework for building information-flow verified web applications [16], and its current design is influenced by such dynamic applications.
In separate work, as of yet unpublished, we have conducted a complete and in-depth comparison between the two languages and all their features, including a Paragon library that gives a complete implementation of Jif, but the full details of that comparison are out of scope for this paper.
Compilers performing IF tracking. The JOANA tool (Java Object-sensitive ANAlysis) [22] is a framework for checking information-flow policies for Java. It stands out as being (i) very precise relative to e.g. Paragon or Jif (i.e. it is likely that it exhibits fewer false positives for bad information flows through a sophisticated program analysis [24]) and (ii) by covering the full language including concurrency [21]. Policies are classical multi-level security lattices augmented with simple declassification annotations in the programs.
Information flow tracking can be performed in a language which has no inherent security policies, lattice-based or otherwise. In such a setting one tracks the way that information flows from e.g. method parameters to outputs. An examples of a tool performing such analysis is the Spark Examiner, operating over a safety-critical subset of Ada [15].
In our comparison of related work we have focussed here on sound tracking of information flows. It is worth mentioning a family of unsound analyses – the so-called taint analyses – which deliberately ignore all indirect information flows. Taint analysis can be useful for debugging information flow errors, and in practice has achieved scalability and industrial application beyond the state of the art for sound analyses; see, for example, IBM’s taint analysis for Java [49].
Runtime information flow tracking with expressive policies. Runtime information flow tracking systems have experienced a recent surge of interest. The most relevant examples from the perspective of the present paper are those which perform full information flow tracking (rather than the semantically incomplete “taint analysis”), and employ expressive policies. The first example is from Stefan et al. [43], an embedding of information flow in Haskell, the LIO system, which implements a runtime monitor for arbitrary lattice policies.
Buiras and van Delft extend LIO to allow policies to be state-dependent, yielding SLIO [13]. This, in principle, provides a framework in which Paralocks-style policies could be implemented by dynamic checking. One of the case studies is the simpler Flow locks policy language [14].
Yang et al.’s Jeeves language [55] focuses on confidentiality properties of data expressed as context-dependent data visibility policies. The Jeeves approach is noteworthy in its novel implementation techniques and greater emphasis on the separation of policy and code. A somewhat related approach in both policy and implementation (in our opinion) is work by Vanhoef et al. [53] on stateful declassification properties, implemented in a web browser using secure multi-execution based methods [17].
Encoding information flow policies with existing type systems. With suitably expressive type systems and abstraction mechanisms, static information flow constraints can be expressed via a library [29,30,39].
A number of recent expressive languages are aimed at expressing a variety of rich security policies, but do not have information flow control as a primitive notion (as Paragon or Jif) [26,48]. For example, the authorisation policy language Aura can be persuaded to model information flow and declassification polices [26]. Fable [48] focuses on the general idea of label-based policies, allowing user-defined labels and typing constraints (via dependent types). One example is the encoding of a standard information flow lattice policy. A weakness of this approach, according to [47], is that “verification depends on intricate security proofs too cumbersome for programmers to write down”. These concerns are in part addressed by Swamy et al. [47] which introduced F*, a fully-fledged implementation of a dependently typed programming ML-style programming language, which is the culmination of a series of languages (from the same group) including Fine [46], FX [9], and F7 [6]. Finally, the Relational Hoare Type Theory (RHTT) presented by Nanevski et al. introduces a novel language and verification system that allow for the formalisation of various security policies, including Paralocks security, via the means of type-theoretic constructions [35]. The authors argue that most of the work in the F* series, apart from their path-sensitive assertions, can be encoded in RHTT thanks to its more general type system.
Typestate. The way that Paragon tracks locks is related to the concept of typestate [44]. Typestate acknowledges that the runtime state of e.g. an object often determines which methods are safe to call. For example, for a Java
Conclusions and further work
It is our expectation that one day programming languages with built-in support for expressing and enforcing information-flow concerns become widely deployed. Paragon’s strong integration with Java and its relatively natural yet expressive policy specification language lowers the threshold for adopting information/flow aware programming outside the research community. Still, much work is left to be done before Paragon can become a serious competitor to existing programming.
It remains to be seen what the main bottleneck to using Paragon in practice will be. The space for exploration includes room for significant improvement in are such as
concurrency, the design of sufficiently simple policy APIs for specific domains (e.g. for Android apps), the construction of tools to support better programmer feedback and error comprehension, and the precision of the underlying analyses in the implementation.
The support for concurrency requires both theoretical and practical work, in particular if declassification mechanisms are shared among threads. Another planned direction is to present a more substantial formalisation of Paragon’s type system, including a proof of soundness with respect to information flow security.
Regarding the precision of the underlying analyses embedded in the implementation, such as aliasing, not-null, and various dataflow analyses, the present implementation does not attempt to implement the state of the art in precision. But it is far from clear that this is a major problem in practice. Paragon is not a utility for adapting legacy code, it is a new language. Programs can be written in a style that makes them easy to analyse, and has features which support scalability directly. For example, methods have explicit signatures which describe how they manipulate locks. This not only provides separate compilation (the approach to scalability employed by the underlying Java type system), but thanks to polymorphism they give a measure of context sensitivity which appears to be adequate in practice.
That said, we acknowledge a potentially new approach to implementing Paralocks, by leveraging the fact that, at its core, the implementation is essentially an analysis of whether the data dependencies in a program are consistent with a dynamically changing policy [52]. This suggests that one might be able to base the whole implementation on an analysis of pure dependencies, such as those computed by state-of-the-art program dependence graph tools e.g. [23,24].
Footnotes
Acknowledgments
Thanks to the reviewers and all members of the ProSec group at Chalmers for improving this paper with every iteration. We thank our colleagues Wolfgang Ahrendt, Pablo Buiras, Filippo Del Tedesco, Willard Rafnsson and Alejandro Russo for their valuable comments and feedback. Furthermore we would like to thank Jens Lideström, Shayan Najd Javadipour, Javed Nazir and Yannick Zakowski for assisting in the implementation of the Paragon compiler. This work is partly funded by the Swedish agencies SSF and VR, and Websand EC FP7-ICT-STREP.
