Abstract
We propose a Constraint Logic Programming approach for synthesizing block-structured scheduling processes with ordering constraints. Then we extend the model to allow specification of resource constraints. Our goal is to design optimization algorithms. We combine block structured modeling of business processes with results from project scheduling literature. Differently from standard approaches, here we focus on block structured scheduling processes. Our main achievement is the formulation of an abstract mathematical model of block-structured resource-constrained scheduling processes. We tested the correctness and feasibility of our approach using an experimental prototype based on Constraint Logic Programming developed using ECLiPSe-CLP system.
Introduction
Business process modeling attracted a lot of research interests in information systems and software engineering communities. Process representations have many applications in related areas including manufacturing, scheduling, software engineering, parallel computing, and process mining. Novel methods based on declarative specification of process requirements were proposed in the literature of business process management [27].
There are many modelings of business processes [21]. One can distinguish between graph-based unstructured and block-structured modelings. Block-structured process models, in particular process trees, have certain advantages compared with other approaches, regarding the flexibility, correctness and robustness of the models [15]. The problem of synthesizing a block-structured process models from declarative specifications has only been recently approached [22,23].
Scheduling is a traditional problem in Computer Science with applications in many different areas including manufacturing, multiprocessor systems, and project management [14]. The problem asks for finding feasible schedules of given set of activities taking into account various constraints, like activity precedence, resource availability, and activity modes, and that optimize given criteria, like for example total completion time. This problem is generally known to be intractable [34] and thus it has attracted a lot of research interest for developing new methods involving Artificial Intelligence approaches [26].
Standard schedules usually have a tight dependence on the duration of their enclosed tasks, i.e. if a task takes a longer time than initially stated then the resulting schedule might fail to satisfy the ordering constraints, i.e. schedule adaptation or even rescheduling are necessary. This limitation in the flexibility of schedules could be a serious drawback in many practical applications where failing to comply to the scheduling constraints could lead to disastrous consequences.
On the other hand, business process models have the advantage that they represent template processes satisfying the correctness requirements independently of activity durations. Therefore a process model is a kind of generic behavioral recipe that can be reused in various situations, without the need to worry about the failure of compliance with problem constraints. So it should be now obvious that bringing together project scheduling from project management community with process modeling from business process management community might exhibit certain important advantages regarding the increased flexibility of resulting schedules that would be valid in far more general contexts.
This explains the current interest in developing new methods for capturing more flexible and robust schedules based on block structured business process models satisfying a given set of constraints [22,23]. Such representations of block-structured schedules will be called in what follows block-structured scheduling processes. It was shown that manual construction of such non-trivial models is almost impossible or at least not scalable. Therefore, the interest has shifted to developing new methods for automated generation of block structured scheduling processes based on suitable explorations of the space of possible solutions.
In this paper we are interested in determining optimal or at least, as efficient as possible, block-structured scheduling processes that satisfy a given set of constraints. The optimization criterion requires the minimization of the total completion time (makespan in what follows). The constraints include ordering constraints imposing precedence relations between activities, as well as resource constraints, restricting resource availability, similarly to traditional project scheduling [14]. We assume that: i) our processes are using only sequential and parallel composition; ii) each activity must have exactly one instance in the schedule; iii) the representation formalism is based on suitable chosen subset of process trees [15].
We propose a declarative representation of process trees based on Constraint Logic Programming (CLP hereafter). Differently from standard approaches [14] and previous works [2,22], here we focus on the synthesis of block structured scheduling processes with both resource and ordering constraints. Our main achievement is the formulation of an abstract model of block-structured resource-constrained scheduling processes. The model has two components for activities and resources modeling. Scheduling processes must satisfy ordering constraints stating the precedence relations for executing pairs of activities. Following the proposal of [14], processes are constrained by availability of resources.
The correctness of the approach is theoretically assessed. We show how this representation can be used to experimentally explore non-trivial small-scale process models. On the other hand, the experiments revealed one (not surprising) weakness of the approach: despite its intellectual charm, it is too slow, thus hindering larger-scale experiments.
Therefore we further propose and experimentally investigate an heuristic approach based on hierarchical decomposition of ordering specification graphs, inspired by [2]. However, differently from [2] where a single hierarchical decomposition process was evaluated, using the CLP approach we are able to explore a larger space of possible hierarchical decompositions for significantly larger problems, while still being able to reduce the project makespan.
Our results have practical value in the area of workflow automation, like for example manufacturing, administration and project management. The main benefit of this work is the ability to design optimal process templates based on block structured process modeling that ensure both optimal achievement of the business goal from the point of view of minimizing total makespan, as well as the structured schedule re-usability without the need to reschedule or to update the schedule if the actual duration of execution of some tasks will deviate from their planned duration.
This paper is a joint extended version of our previous publications [4,6]. The main content was revised, restructured and expanded, while an entirely new Section 2 dedicated to related works was added.
Related works
Results and achievements of our research can be successfully used in the area of business process management (BPM) with application in project scheduling [14]. While BPM is now a very broad and well-established subject [9], in this paper we focus only on a very specific problem that is related to optimization of block structured processes that are used for capturing flexible project schedules [22].
In what follows we provide an overview of relevant related works in BPM area, that will help to better position our results in this field. There are many reasons and approaches for modeling business processes, including analysis, design, integration, and enactment on one hand, as well as traditional process modeling, workflow modeling, object-oriented modeling, service oriented modeling, and rule-based modeling on the other hand [20,21,24]. Our work clearly fits to optimal design using traditional process modelling, in particular using block-structured rather than graph-based languages [15].
Our contribution is focused on the application of Logic Programming (LP in what follows) for computing optimal schedules of structured processes. Here we consider several approaches of the application of LP to optimization, as well as other more general Artificial Intelligence (AI) approaches for the optimal process design problem.
A knowledge-based approach for business process modeling and re-engineering called SHAMASH was proposed in [1]. Authors claim that SHAMASH can be used for process simulation and optimization (second goal is similar to ours). Nevertheless, no evidence for that is provided in this paper which is mostly focused on tool presentation, rather than experimental evaluation of its underlying algorithms.
A hybrid approach for capturing executable models of business processes based on automated planning and Inductive Logic Programming (ILP) is proposed in [10]. While conceptually attractive, the main drawback is that only toy problems were considered in that work, overall the paper lacking a more systematic evaluation of the proposed algorithms underlying the approach.
The more recent works [18,19] address the problem of using automated planning in BPM, in particular for the automated design of template-based process models. Nevertheless, while this approach is “based on declarative problem definition and the resulting templates guarantee sound concurrency in the execution of their activities and are reusable”, the process optimization aspect is not addressed by this approach. We believe that this aspect is essential for the achievement of an efficient business performance.
The general scheduling problem with ordering and resource constraints originates from the areas of multiprocessor and project scheduling. According to the early result [34], scheduling with precedence constraints is NP-complete. Moreover, the problems of scheduling with precedence and resource constraints are included into the standard catalogue [11] of NP-complete problems (problems SS9 and SS10).
Project scheduling is a classical problem in Operations Research. This problem is formally described in [14]. Moreover, standard benchmarks for different variants of the problem are available at [28].
Several intelligent computational methods were employed for developing efficient algorithms for solving different variants of scheduling problem. A comprehensive classification of Artificial Intelligence techniques for scheduling problems was proposed in [35]. It comprises the following classes of approaches: i) Fuzzy Logic; ii) Expert Systems; iii) Machine Learning; iv) Stochastic Local Search; and v) Constraint Programming – CP. Among strengths of CP there are: flexibility, declarative semantics, and advanced search algorithms, while weaknesses of CP include: unpredictable efficiency and high complexity of search algorithms. A new class of hybrid methods combining the strengths of different approaches and strategies was intensely investigated in research literature. A comprehensive survey and classification of such hybrid meta-heuristic approaches for resource constraint project scheduling is discussed in [26].
The problem of optimizing block structured schedules originates from [22,23] as an application to automatic synthesis and verification of industrial commissioning processes. On one hand, block structured processes provide greater flexibility and robustness. For example, a block structured schedule depends only on the ordering constraints, while an ordinary schedule [14] depends on activity durations, i.e. if these durations change then the schedule may turn to be invalid. On the other hand, block structured processes are less expressive than ordinary schedules, i.e. there exist ordinary schedules that cannot be directly mapped to a block structured representation [22].
An approximate optimization algorithm of block structured processes from declarative specifications given as precedence graph using modular decompositions of graphs was proposed in [22,23]. The algorithm was evaluated on a set of real world benchmarks from the area of manufacturing.
A Greedy algorithm for the automatic synthesis of block structured scheduling processes that satisfy given ordering constraints was proposed in [2]. Two heuristics that can be used with this algorithm were proposed: hierarchical decomposition and critical path. The experimental results suggested that the critical path heuristic performs better.
During the last decade a special attention was devoted to process synthesis from event logs, activity known as process mining. Block structured processes were chosen as the target of process mining [15] due to their claimed flexibility and robusteness.
We noticed three major and related logic-based declarative approaches for representation and solving of constraint satisfaction problems: Constraint Logic Programming (CLP) [25], Answer Set Programming (ASP) [16] and Boolean Satisfiability checking (SAT) [17]. Several combinatorial optimization problems were also approached using various LP methods, including CLP, ASP, and SAT.
ASP is a form of LP based on stable model semantics. For example, winner determination is an important task for preference aggregation occurring in combinatorial auctions and voting applications. New approaches based on ASP for encoding voting rules [8] and for capturing combinatorial auction knowledge [7] were proposed in the literature.
CLP is an extension of LP with specialized algorithms for constraint handling, including consistency maintenance and constraint propagation. A knowledge-based agent for brokering logistics services in a vehicle routing with pickup and delivery problem was proposed in [5]. The agent is using a CLP-based knowledge representation and reasoning method for determining optimal allocations of trucks to transportation orders. A formulation of the Maximal Clique problem as a SAT problem was proposed in [3]. The constraints were mapped to a CLP representation, and the resulted representation was fed to the ECLiPSe-CLP [25,30] for the experimental evaluation.
Preliminaries
In this section we define block-structured scheduling processes using the methodology and notation borrowed from [2]. This presentation includes the definition of process tress, ordering graphs, resource constraints, and optimal processes. Introduction of resource constraints follows the standard terminology from [14].
Process trees
Let us consider a finite nonempty set of activities Σ. A trace
A language
The sequential composition of two languages
This notation can be extended for a trace t and a language L as:
The parallel composition of two traces
For each nonempty trace t we have:
For each nonempty traces
The parallel composition
In what follows we focus on process models that represent sets of possible activity schedules. A schedule must contain exactly one instance of each activity and it is composed using sequential (→) and parallel (∥) operators.
For the definition of scheduling processes we must introduce the support set
A block-structured scheduling process is recursively defined as follows:
If a is an activity then a is also a process such that
If P and Q are processes such that
The language
We assume that operator ∥ has higher precedence and operator → has lower precedence. Both operators are associative, while ∥ is also commutative.
It is not difficult to observe that if P is a well-formed block-structured scheduling process then all its traces
Process trees can be graphically depicted either as binary trees or as block-structured flowcharts.
Let us consider process

Process tree of

From left to right: ordering graph
We can impose ordering (precedence) constraints of the activities of a process, based on domain-specific semantics. For example, if two activities are independent and there are enough resources to be allocated to each of them then those activities can be scheduled for parallel execution. On the contrary, if an activity a depends on the output produced by another activity b, then activity a can be scheduled for execution only after the completion of the activity b, i.e. there is a sequencing constraint between the execution order of activities a and b.
These precedence constraints between activities are specified using an activity ordering graph
Σ is the set of nodes and each node represents an activity in Σ.
Observe that for an activity ordering graph
Let
Let
The language
Let P be a scheduling process and let
The set of processes P such that Figure 2 shows an ordering graph
Processes are constrained by availability of the resources required for executing their activities. According to standard project scheduling literature [14], resources can be classified as: renewable, nonrenewable, and doubly constrained.
Renewable resources are available on a period-by-period basis. Per-period available quantity is assumed constant. Examples are: manpower, machines, fuel flow, space.
Nonrenewable resources are limited on a total project basis. There is a limited overall consumption quantity of a nonrenewable resource for the entire project. Examples are: money, energy, raw material.
Doubly constrained resources are limited on a total project basis, as well as per-period basis. Examples are: money if both project budget and per-period cash flow are limited; manpower if a skilled worker can spend only a limited number of periods on the project. Note that doubly constrained resources can be taken into account by appropriately extending sets of renewable and nonrenewable resources.
Let
For each renewable resource
For each nonrenewable resource
Each process P consumes
If
Functions
If a resource
Time can be also considered a (nonrenewable) resource. If two processes are not constrained in any way and can be grouped in parallel then duration of the resulted process is equal to the maximum of the durations of each process, i.e.:
For each
Optimal process
Each activity execution consumes a positive real time, so “time” can be reasonably assimilated as a nonrenewable resource denoted with
The minimum duration of execution of a process that satisfies a given ordering graph
In what follows we consider that there exists a single nonrenewable resource “time”, i.e.
There is a finite and nonempty set of processes that satisfy an ordering graph
Moreover, as there is an exponential number of candidate processes satisfying
Relational models of optimal process trees
In this section we introduce our own representation model of optimal process trees using tools provided by constraint modeling. This is a core representation, so we ignore resource constraints. They will be later on added on top of this core model. Moreover, as optimization criteria we consider the shortest makespan.
Relational representation of process trees
Let us assume that our finite nonempty set of activities Σ contains
Sequence
If
If
If
The process tree
Let us define a set of constraints that must be imposed to ensure that sequence V of elements of set
A process tree contains n leaf nodes representing activities and
Any two distinct leaf nodes must have distinct activity labels (see equation (11)).
A subsequence of V rooted at node i representing a process subtree consists of either a single activity node (
The operator node i (
The subsequence of V representing the left subtree of tree rooted at i, followed by,
The subsequence of V representing the right subtree of tree with root i.
This constraint can be defined by introducing additional decision variables in our model represented by sequence L of m elements such that
The value of sequence L for process tree
One can easily notice that always
Using L, the definition of the constraint on V stating that operator-rooted sub-sequences of V represent well-formed binary process sub-trees is given by equations (12).
It is not difficult to observe that the same process can be represented by multiple distinct process trees. This happens because sequential composition operator → is both left and right associative, while parallel composition operator ∥ is both left and right associative and commutative.
Process terms
Firstly, we can restrict a process tree
Similarly, we can restrict a process tree
Finally, to deal with the commutativity of parallel composition ∥, for each process
A more explicit formulation of constraint (15) is deferred for the next subsection.
We now define the formal requirements of process trees represented as process terms that satisfy a set of ordering constraints. Analysing separately the situations when the process term is built using the sequential or parallel composition operator, we obtain an explicit characterization of conditions when a certain process tree satisfies the given ordering constraints. These conditions are stated by Proposition 1.
Let
If
If
The convenient formulation of the constraints resulted from Proposition 1 requires the extension of our model with additional decision variables representing the support sets of each subprocess of a given process.
For each
The support sets for process tree
Support sets are introduced using equations (16).
For the specification of the ordering constraints let us assume that the numbering of the nodes of ordering graph
The precedence constraints are captured using the adjacency matrix B of G. B is an upper triangular
Finally, constraint (15) can be made more explicit by introducing support sets, resulting equation (18).
Our proposed representation model is correct and complete. This result is precisely stated by Proposition 2. Basically this means that if a process satisfies the ordering constraints then it has a representation in our model that satisfies the equations (completeness) and conversely, if a representation in our model satisfies the equations then it defines a process that satisfies the ordering constraints (correctness). Let Σ be a set of activities, let
Let us assume that durations of activities are represented using a sequence D of n positive real numbers.
For computing costs we introduce a vector C of m decision variables such that
So, the duration
Hierarchical decomposition processes
The set of candidate process trees that can be defined for n activities, by ignoring ordering constraints and counting also redundant process trees (i.e. by omitting pruning constraints) is equal to the number of well-formed binary trees with n leaves (Catalan number of order
So, rather than searching this huge space of feasible solutions, an alternative approach would be to define and then explore a significantly smaller subspace, that still contains reasonable suboptimal processes.
In this section we follow the idea of hierarchical decomposition of the ordering graph initially introduced in [2]. However, instead of defining a single hierarchical decomposition process, we use CLP to define the subspace of all hierarchical decomposition processes that are “consistent” with the hierarchical decomposition of the ordering graph and then we explore it to pickup the optimal hierarchical decomposition process, actually obtaining a suboptimal solution for our problem.
Subspace of hierarchical decomposition processes
Let
For each node
For each node
Let
Considering the graph
Now, rather than defining a single hierarchical decomposition of
A hierarchical decomposition mapping of ordering graph
Note that the set
For each mapping
Let
Then
We are interested to determine the optimal hierarchical decomposition process defined using equations (22).
Further pruning rule
In this section we introduce an additional pruning rule based on the height of an ordering graph introduced in the previous subsection. It states an upper bound for the number of parallel compositions and it can be useful to prune the search of an optimal process that satisfies the ordering graph.
Let
(Maximum Number of Parallel Composition Operators).
Relational representation of hierarchical decomposition processes
Assuming that Figure 2 presents two hierarchical decomposition processes
The process
Let
If
Computational experiments
We have performed two classes of computational experiments with our models introduced in Section 4. The first class of experiments involves models without resource constraints. The second class of experiments involves models with renewable resource constraints. In both classes we are interested in the minimization of the process makespan. So, according to the discussion in Section 3.3, we consider that “time” is the only nonrenewable resource available in all our experiments.
The first class of experiments without resource constraints is further sub-divided in other two sub-classes as follows:
Experiments for assessing the correctness of the CLP model of process trees with ordering constraints.
Experiments for exploring the subspace of hierarchical decomposition processes defined by an ordering graph.
The second class of experiments was aimed to assess the correctness and feasibility of our approach by exploring the subspace of hierarchical decomposition processes defined by an ordering graph and satisfying the resource constraints.
Experimental setup
In this subsection we introduce the tools and data sets used in our experiments.
Environment
We have used the 64-bit version of ECLiPSe-CLP [33] 7.0 #44 on an x64-based PC with a 2 cores / 4 threads Intel© Core™i7-5500U CPU at 2.40 GHz running Windows 10. We extracted the timing information using statistics(hr_time,Time) system predicate that determines the value of a high-resolution timer in seconds [33].
We have developed two experimental CLP programs,2
The complete ECLiPSe-CLP code and the data sets that we have used in our experiments can be downloaded from
Declarative loops for the implementation of model formulas involving universal quantifiers [29].
Reified constraints that allow to mix integer constraints with Boolean constraints by reifying truth values as 0 and 1.
We have tried both integer arithmetic constraint solver IC that is a native part of ECLiPSe-CLP, as well as Gecode generic constraint solver [31] version 4.4.0 that is incorporated into ECLiPSe-CLP as an external library. Nevertheless, most of the experiments were carried out with Gecode, as we realized it is faster than IC for our models.
Note that we have used both exhaustive, as well as branch-and-bound search, to explore the set of solutions for small scale problems, and to determine suboptimal solutions for larger problems [25].
First Data Set. For the first class of experiments we generated a number of random DAGs of increasing size representing ordering constraints, as well as random durations of execution for each activity of the graph. The parameters of each instance of this data set are: number n of graph nodes, number
The DAGs were generated for the following values of the parameters:
Each data set instance was saved into Prolog format with schema presented in Listing 1.

Scheduling problem given as a set of Prolog facts
Note that these sizes of our data set are significantly smaller than those used in our previous work [2]. This is explained by the fact that here we are using generic constraint-based solvers, that are aimed to work with arbitrary constraint-based models, while in [2] we have used an heuristic algorithm, especially tailored for the problem in hand.
Second Data Set. For the second class of experiments we have applied our prototype CLP program to the j30 data set from [28]. This benchmark set contains 480 resource-constrained project scheduling problems, each problem involving the optimization of the total makespan of projects with 30 activities and 4 renewable constraints.
Each data set was converted from .RCP format into Prolog with schema presented in Listing 2.

Resource constraint scheduling problem given as a set of Prolog facts
Correctness experiments
The first set of experiments was focused on the correctness of the model introduced by Proposition 2. The model was manually compiled to ECLiPSe-CLP and was enhanced with pruning rule (23) and bounding rule (26), where l is the level mapping (see equation (20)).
These were small scale experiments, for graphs of sizes
Queries for the first set of experiments
Queries for the first set of experiments
Results for graphs of size
Table 2 presents some of the results obtained for graphs with
The last row of Table 2 presents the total time consumed by running a given query for the whole set of 19 graphs (note however that results for only 12 of them are shown here). We can observe that the smallest execution time was obtained for query 1. This result was somehow expected as query 1 represents a direct and unique call to Gecode using 4 threads, and our experiments were performed on a machine with 2 cores / 4 threads, so this setting can be considered optimal from this perspective. For example, query 4 uses only 2 threads, while query 5 is compiled into a sequence of calls to Gecode, using a default configuration (i.e. number of threads could not be explicitly set). The largest execution time was recorded by query 2, that used the IC solver.
Results for graphs of size
Table 3 presents some of the results obtained for graphs with
Missing values in column
In this section we present experimental results obtained using the hierarchical decomposition model. This model is simpler and it allows to compute suboptimal solutions for significantly larger problems than the complete model. It involved the use of query 8 presented in Table 1. The outcome is represented by the array
We performed experiments on a set of graphs of larger size
Results for graphs of size
Results for graphs of size
We have applied our prototype CLP program to the j30 data set from [28]. This benchmark set contains 480 resource-constrained project scheduling problems, each problem involving the optimization of the total makespan of projects with 30 activities and 4 renewable constraints.
The total time for processing our data set was

Experimental results for problems 100–199 of the j30 data set showing problem (X axis) vs project makespan (Y axis).
Taking into account that we did not have the optimal makespan values for our sample data set (PSPLIB contains optimal values for the provided data sets, but only for unstructured optimal schedules), we compared our results with the following measures (see Fig. 3 that presents the results obtained for problems 100–199 of j30 data set):
The critical path of the set of activities [13] denoted with
The costs associated to the hierarchical decomposition processes defined by mappings ℓ and
The subptimal cost obtained using our heuristic approach, denoted with
The lower and upper bounds of the costs, representing the sum of durations of all the activities (
We have developed CLP models for optimization of block structured scheduling processes. The correctness of the models was theoretically assessed. These models were fed to a CLP solver to explore the space of solutions. The results shown that, despite its intellectual charm, CLP is sometimes too slow compared with other approaches, thus hindering larger-scale experiments. Nevertheless, careful identification of interesting subspaces of potential solutions might drastically prune the search, enabling CLP to produce suboptimal solutions of practical value.
We have also extended our formal model of block structured scheduling processes with resource constraints, following ideas from traditional project scheduling literature. We have presented experimental results obtained using a Constraint Logic Programming prototype and an heuristic search algorithm derived by combining hierarchical decomposition of directed acyclic graphs with bin packing. We have used a standard benchmark data set from the project scheduling literature.
As future work we plan to strengthen our results by expanding the experimental evaluation, possibly with different solvers, and using other search strategies. We also envision possibly extending the model to include multiple mode activities, as typically encountered in classical project scheduling.
