Block structured scheduling using constraint logic programming

Abstract

We propose a Constraint Logic Programming approach for synthesizing block-structured scheduling processes with ordering constraints. Then we extend the model to allow specification of resource constraints. Our goal is to design optimization algorithms. We combine block-structured modeling of business processes with results from the project scheduling literature. Unlike standard approaches, here we focus on block-structured scheduling processes. Our main achievement is the formulation of an abstract mathematical model of block-structured resource-constrained scheduling processes. We tested the correctness and feasibility of our approach using an experimental prototype based on Constraint Logic Programming, developed using the ECLiPSe-CLP system.

Keywords

Planning & scheduling, constraint logic programming, optimization

1. Introduction

Business process modeling has attracted a lot of research interest in the information systems and software engineering communities. Process representations have many applications in related areas including manufacturing, scheduling, software engineering, parallel computing, and process mining. Novel methods based on declarative specification of process requirements were proposed in the literature of business process management [27].

There are many approaches to modeling business processes [21]. One can distinguish between graph-based unstructured and block-structured models. Block-structured process models, in particular process trees, have certain advantages compared with other approaches, regarding the flexibility, correctness and robustness of the models [15]. The problem of synthesizing block-structured process models from declarative specifications has only recently been approached [22,23].

Scheduling is a traditional problem in Computer Science with applications in many different areas including manufacturing, multiprocessor systems, and project management [14]. The problem asks for feasible schedules of a given set of activities that take into account various constraints, such as activity precedence, resource availability, and activity modes, and that optimize given criteria, for example total completion time. This problem is generally known to be intractable [34] and thus it has attracted a lot of research interest for developing new methods involving Artificial Intelligence approaches [26].

Standard schedules usually depend tightly on the duration of their enclosed tasks: if a task takes longer than initially stated then the resulting schedule might fail to satisfy the ordering constraints, so schedule adaptation or even rescheduling becomes necessary. This limited flexibility could be a serious drawback in many practical applications where failing to comply with the scheduling constraints could lead to disastrous consequences.

On the other hand, business process models have the advantage that they represent template processes satisfying the correctness requirements independently of activity durations. Therefore a process model is a kind of generic behavioral recipe that can be reused in various situations, without the risk of violating the problem constraints. Bringing together project scheduling from the project management community and process modeling from the business process management community can therefore increase the flexibility of the resulting schedules, which remain valid in far more general contexts.

This explains the current interest in developing new methods for capturing more flexible and robust schedules based on block structured business process models satisfying a given set of constraints [22,23]. Such representations of block-structured schedules will be called in what follows block-structured scheduling processes. It was shown that manual construction of such non-trivial models is almost impossible or at least not scalable. Therefore, the interest has shifted to developing new methods for automated generation of block structured scheduling processes based on suitable explorations of the space of possible solutions.

In this paper we are interested in determining optimal, or at least as efficient as possible, block-structured scheduling processes that satisfy a given set of constraints. The optimization criterion requires the minimization of the total completion time (makespan in what follows). The constraints include ordering constraints imposing precedence relations between activities, as well as resource constraints, restricting resource availability, similarly to traditional project scheduling [14]. We assume that: i) our processes use only sequential and parallel composition; ii) each activity must have exactly one instance in the schedule; iii) the representation formalism is based on a suitably chosen subset of process trees [15].

We propose a declarative representation of process trees based on Constraint Logic Programming (CLP hereafter). Differently from standard approaches [14] and previous works [2,22], here we focus on the synthesis of block structured scheduling processes with both resource and ordering constraints. Our main achievement is the formulation of an abstract model of block-structured resource-constrained scheduling processes. The model has two components for activities and resources modeling. Scheduling processes must satisfy ordering constraints stating the precedence relations for executing pairs of activities. Following the proposal of [14], processes are constrained by availability of resources.

The correctness of the approach is theoretically assessed. We show how this representation can be used to experimentally explore non-trivial small-scale process models. On the other hand, the experiments revealed one (not surprising) weakness of the approach: despite its intellectual charm, it is too slow, thus hindering larger-scale experiments.

Therefore we further propose and experimentally investigate a heuristic approach based on hierarchical decomposition of ordering specification graphs, inspired by [2]. However, unlike [2], where a single hierarchical decomposition process was evaluated, using the CLP approach we are able to explore a larger space of possible hierarchical decompositions for significantly larger problems, while still being able to reduce the project makespan.

Our results have practical value in the area of workflow automation, for example in manufacturing, administration and project management. The main benefit of this work is the ability to design optimal process templates based on block-structured process modeling. Such templates minimize the total makespan and, at the same time, can be reused without rescheduling or updating the schedule when the actual duration of some tasks deviates from the planned duration.

This paper is a joint extended version of our previous publications [4,6]. The main content was revised, restructured and expanded, while an entirely new Section 2 dedicated to related works was added.

2. Related works

Results and achievements of our research can be successfully used in the area of business process management (BPM) with application in project scheduling [14]. While BPM is now a very broad and well-established subject [9], in this paper we focus only on a very specific problem that is related to optimization of block structured processes that are used for capturing flexible project schedules [22].

In what follows we provide an overview of relevant related works in the BPM area, which will help to better position our results in this field. There are many reasons and approaches for modeling business processes, including analysis, design, integration, and enactment on one hand, as well as traditional process modeling, workflow modeling, object-oriented modeling, service-oriented modeling, and rule-based modeling on the other hand [20,21,24]. Our work fits within optimal design using traditional process modeling, in particular using block-structured rather than graph-based languages [15].

Our contribution is focused on the application of Logic Programming (LP in what follows) for computing optimal schedules of structured processes. Here we consider several approaches of the application of LP to optimization, as well as other more general Artificial Intelligence (AI) approaches for the optimal process design problem.

A knowledge-based approach for business process modeling and re-engineering called SHAMASH was proposed in [1]. The authors claim that SHAMASH can be used for process simulation and optimization (the second goal is similar to ours). Nevertheless, no evidence for that is provided in [1], which is mostly focused on tool presentation rather than experimental evaluation of its underlying algorithms.

A hybrid approach for capturing executable models of business processes based on automated planning and Inductive Logic Programming (ILP) is proposed in [10]. While conceptually attractive, the main drawback is that only toy problems were considered in that work; overall, the paper lacks a more systematic evaluation of the algorithms underlying the approach.

The more recent works [18,19] address the problem of using automated planning in BPM, in particular for the automated design of template-based process models. Nevertheless, while this approach is “based on declarative problem definition and the resulting templates guarantee sound concurrency in the execution of their activities and are reusable”, the process optimization aspect is not addressed by this approach. We believe that this aspect is essential for the achievement of an efficient business performance.

The general scheduling problem with ordering and resource constraints originates from the areas of multiprocessor and project scheduling. According to the early result [34], scheduling with precedence constraints is NP-complete. Moreover, the problems of scheduling with precedence and resource constraints are included into the standard catalogue [11] of NP-complete problems (problems SS9 and SS10).

Project scheduling is a classical problem in Operations Research. This problem is formally described in [14]. Moreover, standard benchmarks for different variants of the problem are available at [28].

Several intelligent computational methods were employed for developing efficient algorithms for solving different variants of the scheduling problem. A comprehensive classification of Artificial Intelligence techniques for scheduling problems was proposed in [35]. It comprises the following classes of approaches: i) Fuzzy Logic; ii) Expert Systems; iii) Machine Learning; iv) Stochastic Local Search; and v) Constraint Programming (CP). The strengths of CP include flexibility, declarative semantics, and advanced search algorithms, while its weaknesses include unpredictable efficiency and high complexity of search algorithms. A new class of hybrid methods combining the strengths of different approaches and strategies was intensely investigated in the research literature. A comprehensive survey and classification of such hybrid meta-heuristic approaches for resource-constrained project scheduling is given in [26].

The problem of optimizing block-structured schedules originates from [22,23] as an application to automatic synthesis and verification of industrial commissioning processes. On one hand, block-structured processes provide greater flexibility and robustness. For example, a block-structured schedule depends only on the ordering constraints, while an ordinary schedule [14] depends on activity durations, i.e. if these durations change then the schedule may turn out to be invalid. On the other hand, block-structured processes are less expressive than ordinary schedules, i.e. there exist ordinary schedules that cannot be directly mapped to a block-structured representation [22].

An approximate optimization algorithm for block-structured processes, starting from declarative specifications given as a precedence graph and using modular decompositions of graphs, was proposed in [22,23]. The algorithm was evaluated on a set of real-world benchmarks from the area of manufacturing.

A greedy algorithm for the automatic synthesis of block-structured scheduling processes that satisfy given ordering constraints was proposed in [2]. Two heuristics that can be used with this algorithm were proposed: hierarchical decomposition and critical path. The experimental results suggested that the critical path heuristic performs better.

During the last decade special attention was devoted to process synthesis from event logs, an activity known as process mining. Block-structured processes were chosen as the target of process mining [15] due to their claimed flexibility and robustness.

We noticed three major and related logic-based declarative approaches for representation and solving of constraint satisfaction problems: Constraint Logic Programming (CLP) [25], Answer Set Programming (ASP) [16] and Boolean Satisfiability checking (SAT) [17]. Several combinatorial optimization problems were also approached using various LP methods, including CLP, ASP, and SAT.

ASP is a form of LP based on stable model semantics. For example, winner determination is an important task for preference aggregation occurring in combinatorial auctions and voting applications. New approaches based on ASP for encoding voting rules [8] and for capturing combinatorial auction knowledge [7] were proposed in the literature.

CLP is an extension of LP with specialized algorithms for constraint handling, including consistency maintenance and constraint propagation. A knowledge-based agent for brokering logistics services in a vehicle routing with pickup and delivery problem was proposed in [5]. The agent uses a CLP-based knowledge representation and reasoning method for determining optimal allocations of trucks to transportation orders. A formulation of the Maximal Clique problem as a SAT problem was proposed in [3]. The constraints were mapped to a CLP representation, and the resulting representation was fed to the ECLiPSe-CLP system [25,30] for the experimental evaluation.

3. Preliminaries

In this section we define block-structured scheduling processes using the methodology and notation borrowed from [2]. This presentation includes the definition of process trees, ordering graphs, resource constraints, and optimal processes. The introduction of resource constraints follows the standard terminology from [14].

3.1. Process trees

Let us consider a finite nonempty set of activities Σ. A trace $t \in Σ^{*}$ is a sequence of zero or more activities, where $Σ^{*}$ denotes the set of all sequences consisting of zero or more elements of Σ.

The length of a trace $t = a_{1} a_{2} \dots a_{n}$ is n and is denoted as $| t | = n$ . The empty trace is denoted by ε and $| ε | = 0$ . For each nonempty trace $t = a_{1} a_{2} \dots a_{n}$ we define: i) the head of t as $head (t) = a_{1}$ , and ii) the tail of t as $tail (t) = a_{2} \dots a_{n}$ .

A language $L \subseteq Σ^{*}$ is defined as a set of traces. We can define certain operations with languages.

The sequential composition of two languages $L_{1}$ and $L_{2}$ denoted by $L_{1} \to L_{2}$ , is defined as follows: $\begin{matrix} (1) & L_{1} \to L_{2} = {w = l_{1} l_{2} ∣ l_{1} \in L_{1}, l_{2} \in L_{2}} \end{matrix}$

This notation can be extended for a trace t and a language L as: $t \to L = {t} \to L$ .

The parallel composition of two traces $t_{1}$ and $t_{2}$ , denoted by $t_{1} ∥ t_{2}$ , is defined as:

For each nonempty trace t we have: $t ∥ ε = ε ∥ t = {t}$

For all nonempty traces $t_{1}$ and $t_{2}$ we have: $\begin{matrix} (2) & \begin{matrix} t_{1} ∥ t_{2} \\ = (head (t_{1}) \to (tail (t_{1}) ∥ t_{2})) \\ \cup (head (t_{2}) \to (t_{1} ∥ tail (t_{2}))) \end{matrix} \end{matrix}$

The parallel composition $L_{1} ∥ L_{2}$ of two languages $L_{1}$ and $L_{2}$ is now defined as: $\begin{matrix} (3) & L_{1} ∥ L_{2} = ⋃_{t_{1} \in L_{1}, t_{2} \in L_{2}} t_{1} ∥ t_{2} \end{matrix}$
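The operations defined by equations (1), (2) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: traces are encoded as strings of one-character activity names, and the function names `seq`, `shuffle` and `par` are hypothetical, not part of the original formalism.

```python
from functools import lru_cache

def seq(l1, l2):
    """Sequential composition L1 -> L2 of two languages, equation (1)."""
    return {t1 + t2 for t1 in l1 for t2 in l2}

@lru_cache(maxsize=None)
def shuffle(t1, t2):
    """Parallel composition t1 || t2 of two traces, equation (2).

    Returns the set of all interleavings of t1 and t2.
    """
    if not t1 or not t2:
        # One side is the empty trace: the only interleaving is the other trace.
        return frozenset({t1 + t2})
    head_first = {t1[0] + w for w in shuffle(t1[1:], t2)}
    head_second = {t2[0] + w for w in shuffle(t1, t2[1:])}
    return frozenset(head_first | head_second)

def par(l1, l2):
    """Parallel composition L1 || L2 of two languages, equation (3)."""
    return {w for t1 in l1 for t2 in l2 for w in shuffle(t1, t2)}

print(sorted(shuffle("ab", "c")))              # ['abc', 'acb', 'cab']
print(sorted(seq({"c"}, par({"a"}, {"b"}))))   # ['cab', 'cba']
```

The number of interleavings grows combinatorially with the trace lengths, so such an explicit enumeration is only practical for very small examples.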

In what follows we focus on process models that represent sets of possible activity schedules. A schedule must contain exactly one instance of each activity and it is composed using sequential (→) and parallel (∥) operators.

For the definition of scheduling processes we must introduce the support set $supp (P)$ of a process P that represents the set of activities that occur in P. In what follows let us denote with $a, b, c, \dots$ the activities of Σ and with $P, Q, R, \dots$ process terms.

A block-structured scheduling process is recursively defined as follows:

If a is an activity then a is also a process such that $supp (a) = {a}$ .

If P and Q are processes such that $supp (P) \cap supp (Q) = \emptyset$ then $P \to Q$ and $P ∥ Q$ are processes with $supp (P \to Q) = supp (P ∥ Q) = supp (P) \cup supp (Q)$ .

The language $L (P)$ of process P contains all the possible traces of P according to the interleaving semantics and it is recursively defined as follows:

$L (a) = {a}$

$L (P \to Q) = L (P) \to L (Q)$

$L (P ∥ Q) = L (P) ∥ L (Q)$

We assume that operator ∥ has higher precedence and operator → has lower precedence. Both operators are associative, while ∥ is also commutative.

It is not difficult to observe that if P is a well-formed block-structured scheduling process then all its traces $t \in L (P)$ have the same length $| t | = | supp (P) |$ .

Process trees can be graphically depicted either as binary trees or as block-structured flowcharts.

Example 1.

Let us consider process $P = c \to (a ∥ b)$ shown in Fig. 1. Note that $supp (P) = {a, b, c}$ and $L (P) = {c a b, c b a}$ . Observe that all traces of this process have the same length $3 = | {a, b, c} |$ .

Fig. 1.

Process tree of $P = c \to (a ∥ b)$ (left) and its equivalent block-structured model (right).

Fig. 2.

From left to right: ordering graph $G_{1}$ , process $P_{1}$ , process $P_{2}$ , and process $P_{3}$ .

3.2. Ordering graph

We can impose ordering (precedence) constraints of the activities of a process, based on domain-specific semantics. For example, if two activities are independent and there are enough resources to be allocated to each of them then those activities can be scheduled for parallel execution. On the contrary, if an activity a depends on the output produced by another activity b, then activity a can be scheduled for execution only after the completion of the activity b, i.e. there is a sequencing constraint between the execution order of activities a and b.

These precedence constraints between activities are specified using an activity ordering graph $G = ⟨ Σ, E ⟩$ such that:

Σ is the set of nodes and each node represents an activity in Σ.

$E \subseteq Σ \times Σ$ is the set of edges. Each edge represents an ordering constraint. If $(u, v) \in E$ then activity v cannot occur in a schedule without being preceded by activity u.

Observe that an activity ordering graph $G = ⟨ Σ, E ⟩$ cannot contain cycles: the transitive closure of E must define a strict partial ordering relation on Σ, i.e. one that is irreflexive and transitive (and hence antisymmetric). In standard project scheduling terminology, graph $G$ is known as an activity-on-node network [14] and it is a directed acyclic graph (DAG hereafter).

Let $t = a_{1} a_{2} \dots a_{n}$ be a trace of a scheduling process and let u and v be two activities occurring in t. Then u precedes v in t, denoted as $u \overset{t}{\to} v$ if there are $1 ⩽ i < j ⩽ n$ such that $a_{i} = u$ and $a_{j} = v$ .

Let $G = ⟨ Σ, E ⟩$ be an ordering graph and let t be a trace containing all the activities of Σ with no repetition (i.e. a permutation of Σ). Then t satisfies $G$ , written as $t ⊧ G$ , if and only if $E \subseteq \overset{t}{\to}$ . This means that trace t cannot contain activities ordered differently than as specified by $G$ .

The language $L (G)$ of an ordering graph $G$ is the set of all traces that satisfy $G$ , i.e: $\begin{matrix} (4) & L (G) = {t ∣ t ⊧ G} \end{matrix}$

Let P be a scheduling process and let $G = ⟨ Σ, E ⟩$ be an ordering graph. P satisfies $G$ written as $P ⊧ G$ , if and only if:

$L (P) \subseteq L (G)$ , i.e. each trace of P satisfies $G$ , and

$supp (P) = Σ$ , i.e. all the activities of Σ are relevant and occur in P.

The set of processes P such that $P ⊧ G$ is nonempty, as it contains at least one sequential process defined by the topological sorting of $G$ .

Example 2.
Figure 2 shows an ordering graph $G_{1}$ , and three processes $P_{1}$ , $P_{2}$ and $P_{3}$ . The total number of possible traces for the set of activities $Σ = {a, b, c}$ is $3! = 6$ . Moreover, $L (G_{1}) = {a b c, a c b, c a b}$ . Observe also that $L (P_{1}) = {a c b, c a b}$ , $L (P_{2}) = {a b c, a c b}$ , and $L (P_{3}) = {c a b, a c b, a b c}$ , showing that $P_{i} ⊧ G_{1}$ for all $i = 1, 2, 3$ . However, if P is the process from Fig. 1, with $L (P) = {c a b, c b a}$ , then $L (P) ⊈ L (G_{1})$ because $c b a \notin L (G_{1})$ , so $P ⊭ G_{1}$ .
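The claims of Example 2 can be checked mechanically with a small brute-force sketch. Several things here are assumptions made only for illustration: a process term is encoded as a nested tuple `(op, left, right)` with `op` in `{"seq", "par"}`, activities are one-character strings, and $G_{1}$ is taken to contain the single edge (a, b), which is the only edge set consistent with the language $L (G_{1})$ listed above. The process terms for $P_{1}$ , $P_{2}$ and $P_{3}$ are likewise inferred from their listed languages.

```python
from itertools import permutations

def supp(p):
    """Support set: the activities occurring in process term p."""
    return {p} if isinstance(p, str) else supp(p[1]) | supp(p[2])

def interleave(t1, t2):
    """All interleavings of two traces (parallel composition of traces)."""
    if not t1 or not t2:
        return {t1 + t2}
    return ({t1[0] + w for w in interleave(t1[1:], t2)}
            | {t2[0] + w for w in interleave(t1, t2[1:])})

def language(p):
    """Language L(P) under the interleaving semantics."""
    if isinstance(p, str):
        return {p}
    op, left, right = p
    l1, l2 = language(left), language(right)
    if op == "seq":
        return {t1 + t2 for t1 in l1 for t2 in l2}
    return {w for t1 in l1 for t2 in l2 for w in interleave(t1, t2)}

def trace_satisfies(t, edges):
    """t |= G: every edge (u, v) has u occurring before v in t."""
    return all(t.index(u) < t.index(v) for u, v in edges)

def graph_language(sigma, edges):
    """L(G): all permutations of sigma that satisfy the ordering graph."""
    traces = ("".join(p) for p in permutations(sorted(sigma)))
    return {t for t in traces if trace_satisfies(t, edges)}

def satisfies(p, sigma, edges):
    """P |= G: supp(P) = sigma and L(P) is a subset of L(G)."""
    return supp(p) == set(sigma) and language(p) <= graph_language(sigma, edges)

sigma, edges = {"a", "b", "c"}, {("a", "b")}   # assumed content of G1
p1 = ("seq", ("par", "a", "c"), "b")           # (a || c) -> b
p2 = ("seq", "a", ("par", "b", "c"))           # a -> (b || c)
p3 = ("par", ("seq", "a", "b"), "c")           # (a -> b) || c
p = ("seq", "c", ("par", "a", "b"))            # c -> (a || b), from Fig. 1

print(sorted(graph_language(sigma, edges)))             # ['abc', 'acb', 'cab']
print([satisfies(q, sigma, edges) for q in (p1, p2, p3)])  # [True, True, True]
print(satisfies(p, sigma, edges))                       # False
```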

3.3. Resource constraints

Processes are constrained by availability of the resources required for executing their activities. According to standard project scheduling literature [14], resources can be classified as: renewable, nonrenewable, and doubly constrained.

Renewable resources are available on a period-by-period basis. Per-period available quantity is assumed constant. Examples are: manpower, machines, fuel flow, space.

Nonrenewable resources are limited on a total project basis. There is a limited overall consumption quantity of a nonrenewable resource for the entire project. Examples are: money, energy, raw material.

Doubly constrained resources are limited on a total project basis, as well as per-period basis. Examples are: money if both project budget and per-period cash flow are limited; manpower if a skilled worker can spend only a limited number of periods on the project. Note that doubly constrained resources can be taken into account by appropriately extending sets of renewable and nonrenewable resources.

Let $R$ and $N$ be the sets of renewable and nonrenewable resources. We assume that:

For each renewable resource $r \in R$ its per period capacity is $ρ_{r}$ and each activity $a \in Σ$ consumes $ρ_{a, r}$ units of r.

For each nonrenewable resource $n \in N$ its overall capacity is $ν_{n}$ and each activity $a \in Σ$ consumes $ν_{a, n}$ units of n.

Each process P consumes $ρ (P, r)$ units of renewable resource r and $ν (P, n)$ units of nonrenewable resource n. Functions ρ and ν can be defined compositionally as follows:

If $a \in Σ$ then $ρ (a, r) = ρ_{a, r}$ and $ν (a, n) = ν_{a, n}$ .

$ρ (P \to Q, r) = max (ρ (P, r), ρ (Q, r))$ , $ν (P \to Q, n) = ν (P, n) + ν (Q, n)$

$ρ (P ∥ Q, r) = f_{r}^{∥} (ρ (P, r), ρ (Q, r))$ , $ν (P ∥ Q, n) = f_{n}^{∥} (ν (P, n), ν (Q, n))$

Functions $f_{r}^{∥}$ and $f_{n}^{∥}$ describe resource consumption of processes executed in parallel. Typically, they are resource specific and generally they are sub-additive. This means that when several processes are grouped together in parallel they consume at most as many resources as the sum of their individual consumptions, and sometimes strictly fewer. For example, when several virtual machines are packed on a server, their total memory requirement could decrease due to pages shared by all of them that need to be stored only once [32]. Typical examples are:

If a resource $r \in R \cup N$ is not shared at all when processes are grouped in parallel then its consumption can be described by an additive function: $\begin{matrix} (5) & f_{r}^{∥} (q_{1}, q_{2}) = q_{1} + q_{2} \end{matrix}$

Time can be also considered a (nonrenewable) resource. If two processes are not constrained in any way and can be grouped in parallel then duration of the resulted process is equal to the maximum of the durations of each process, i.e.: $\begin{matrix} (6) & f_{time}^{∥} (t_{1}, t_{2}) = max (t_{1}, t_{2}) \end{matrix}$

For each $r \in R$ and $n \in N$ , resource constraints for process P can be now defined by the following inequalities: $\begin{matrix} (7) & \begin{array}{l} ρ (P, r) ⩽ ρ_{r} \\ ν (P, n) ⩽ ν_{n} \end{array} \end{matrix}$
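A minimal sketch of the compositional definition of ρ and of the check in inequalities (7), for a single renewable resource, is given below. The nested-tuple encoding of processes, the demand values and the capacity are hypothetical and chosen only for illustration; the parallel function defaults to the additive case of equation (5).

```python
def rho(p, demand, f_par=lambda x, y: x + y):
    """Per-period consumption rho(P, r) of one renewable resource r.

    p      -- an activity name, or a tuple (op, left, right), op in {"seq", "par"}
    demand -- dict mapping each activity a to its consumption rho_{a,r}
    f_par  -- the resource-specific function f_r^{||}; additive by default
    """
    if isinstance(p, str):
        return demand[p]
    op, left, right = p
    x, y = rho(left, demand, f_par), rho(right, demand, f_par)
    # Sequential parts never overlap in time, so the peak demand is the maximum.
    return max(x, y) if op == "seq" else f_par(x, y)

demand = {"a": 2, "b": 1, "c": 2}      # hypothetical values of rho_{a,r}
capacity = 3                           # hypothetical per-period capacity rho_r

p1 = ("seq", ("par", "a", "c"), "b")   # (a || c) -> b
p2 = ("seq", "a", ("par", "b", "c"))   # a -> (b || c)

print(rho(p1, demand), rho(p1, demand) <= capacity)   # 4 False
print(rho(p2, demand), rho(p2, demand) <= capacity)   # 3 True
```

With these numbers the first process is ruled out by the resource constraint although it may satisfy the ordering constraints, which is exactly the effect of adding inequalities (7) to the model.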

3.4. Optimal process

Each activity execution consumes a positive real amount of time, so “time” can be reasonably treated as a nonrenewable resource denoted with $time$ . The duration of execution (or makespan) $d (P)$ of a process P is defined as follows: $\begin{matrix} (8) & d (P) = ν (P, time) \end{matrix}$

The minimum duration of execution of a process that satisfies a given ordering graph $G$ , as well as a given set of resource constraints is denoted with $d_{OPT} (G)$ . The optimal scheduling process $P^{*}$ is defined as follows: $\begin{matrix} (9) & d (P^{*}) = d_{OPT} (G) \end{matrix}$
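Combining equation (8) with the compositional definition of ν (sum for →, and the maximum of equation (6) for ∥), the makespan of a process can be computed recursively. The sketch below uses hypothetical activity durations and the same assumed nested-tuple encoding of process terms; it only compares three candidate processes and does not search the whole solution space.

```python
def makespan(p, duration):
    """d(P) = nu(P, time): sum over sequential parts, maximum over parallel parts."""
    if isinstance(p, str):
        return duration[p]
    op, left, right = p
    x, y = makespan(left, duration), makespan(right, duration)
    return x + y if op == "seq" else max(x, y)

duration = {"a": 3, "b": 2, "c": 4}    # hypothetical durations

candidates = {
    "P1": ("seq", ("par", "a", "c"), "b"),   # (a || c) -> b
    "P2": ("seq", "a", ("par", "b", "c")),   # a -> (b || c)
    "P3": ("par", ("seq", "a", "b"), "c"),   # (a -> b) || c
}

spans = {name: makespan(p, duration) for name, p in candidates.items()}
best = min(spans, key=spans.get)
print(spans, best)   # {'P1': 6, 'P2': 7, 'P3': 5} P3
```

Note that the structure of each candidate is independent of the durations: changing `duration` changes the makespans, and possibly which candidate is best, but never invalidates a process with respect to the ordering constraints.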

In what follows we consider that there exists a single nonrenewable resource “time”, i.e. $N = {time}$ . This is actually used for defining the optimization criterion as the shortest makespan. So our process is constrained only by renewable resources $R$ .

There is a finite and nonempty set of processes that satisfy an ordering graph $G$ , so the optimal scheduling process trivially exists.

Moreover, as there is an exponential number of candidate processes satisfying $G$ (this will be shown in Section 5), we postulate that the computation of the optimal scheduling process is generally an intractable problem. Therefore, we will be focusing on developing heuristic algorithms that are able to produce “suboptimal” or “good enough” scheduling processes using a reasonable computational effort.

4. Relational models of optimal process trees

In this section we introduce our own representation model of optimal process trees using tools provided by constraint modeling. This is a core representation, so we ignore resource constraints for now; they will be added later on top of this core model. Moreover, as optimization criterion we consider the shortest makespan.

4.1. Relational representation of process trees

Let us assume that our finite nonempty set of activities Σ contains $n ⩾ 1$ elements represented by integers $1, 2, \dots, n$ . A process tree can be uniquely captured as a sequence V of $m = 2 n - 1$ elements of $Σ^{'} = {- 1, 0, 1, 2, \dots, n}$ such that:

Sequence $V_{1}, V_{2}, \dots, V_{m}$ is the preorder traversal of the process tree.

If $1 ⩽ V_{i} ⩽ n$ then $V_{i}$ represents an activity, i.e. a leaf of the process tree.

If $V_{i} = 0$ then $V_{i}$ represents the sequential composition operator →.

If $V_{i} = - 1$ then $V_{i}$ represents the parallel composition operator ∥.

Example 3.
The process tree $P_{1}$ from Fig. 2 with $n = 3$ activities ${a, b, c}$ is represented by sequence $V = [0, - 1, 1, 3, 2]$ of $m = 2 n - 1 = 5$ elements, assuming that a is mapped to 1, b to 2 and c to 3.
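The preorder encoding can be illustrated with a short sketch. The nested-tuple representation of trees and the function names `encode` and `decode` are assumptions made for this example; the operator codes 0 and -1 are those defined above.

```python
SEQ, PAR = 0, -1   # operator codes used in sequence V

def encode(p):
    """Preorder sequence V of a process tree.

    A leaf is an activity number in 1..n; an inner node is a tuple
    (op, left, right) with op equal to SEQ or PAR.
    """
    if isinstance(p, int):
        return [p]
    op, left, right = p
    return [op] + encode(left) + encode(right)

def decode(v):
    """Rebuild the tree from a well-formed preorder sequence V."""
    def parse(i):
        if v[i] >= 1:                    # activity: a leaf
            return v[i], i + 1
        left, j = parse(i + 1)           # left subtree starts right after the operator
        right, k = parse(j)              # right subtree starts where the left one ends
        return (v[i], left, right), k
    tree, end = parse(0)
    if end != len(v):
        raise ValueError("trailing elements after the tree")
    return tree

p1 = (SEQ, (PAR, 1, 3), 2)               # (a || c) -> b with a=1, b=2, c=3
print(encode(p1))                        # [0, -1, 1, 3, 2]
print(decode([0, -1, 1, 3, 2]) == p1)    # True
```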

Let us define a set of constraints that must be imposed to ensure that sequence V of elements of set $Σ^{'}$ defines the preorder of the process tree.

A process tree contains n leaf nodes representing activities and $n - 1$ internal nodes representing composition operators (see equation (10); note that Boolean values “true” and “false” are reified as algebraic values 1 and 0). $\begin{matrix} (10) & \sum_{i = 1}^{m} (V_{i} < 1) = n - 1 \end{matrix}$

Any two distinct leaf nodes must have distinct activity labels (see equation (11)). $\begin{matrix} (11) & \begin{matrix} (\forall i : 1 \dots m) (\forall j : i + 1 \dots m) \\ ((V_{i} = V_{j}) \Rightarrow (V_{i} < 1)) \end{matrix} \end{matrix}$

A subsequence of V rooted at node i representing a process subtree consists of either a single activity node ( $V_{i} ⩾ 1$ ) or it is composed of:

The operator node i ( $V_{i} < 1$ ), followed by,

The subsequence of V representing the left subtree of tree rooted at i, followed by,

The subsequence of V representing the right subtree of tree with root i.

This constraint can be defined by introducing additional decision variables in our model represented by sequence L of m elements such that $L_{i}$ is the number of nodes of the subtree of V rooted at i.

Example 4.
The value of sequence L for process tree $P_{1}$ shown in Fig. 2 is given by $L = [5, 3, 1, 1, 1]$ , while the value of L for process tree $P_{2}$ shown in Fig. 2 is given by $L = [5, 1, 3, 1, 1]$ . We assume that a is mapped to 1, b to 2 and c to 3.

One can easily notice that always $L_{i}$ is an odd number in the set ${1, \dots, m}$ , $L_{1} = m$ and $L_{m} = 1$ . Moreover, if $m > 1$ then $L_{m - 1} = 1$ .

Using L, the definition of the constraint on V stating that operator-rooted sub-sequences of V represent well-formed binary process sub-trees is given by equations (12). $\begin{matrix} (12) & \begin{array}{l} (\forall i : 1 \dots m) ((V_{i} ⩾ 1) \Rightarrow (L_{i} = 1)) \\ (\forall i : 1 \dots m - 2) ((V_{i} < 1) \Rightarrow \\ (L_{i} = 1 + L_{i + 1} + L_{1 + i + L_{i + 1}})) \end{array} \end{matrix}$
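As an illustration, the sequence L can be computed from V by evaluating equations (12) from right to left, rejecting sequences that violate (10), (11) or (12). The sketch below uses 0-based Python indices. It additionally requires that an operator has both children inside the sequence and that the first subtree covers all m nodes ( $L_{1} = m$ ); the text states these as observations, and treating them as explicit checks is an assumption of this sketch.

```python
def subtree_sizes(v):
    """Return the list L of subtree sizes for preorder sequence V, or None if
    V is not a well-formed process tree (0-based indices)."""
    m = len(v)
    n = (m + 1) // 2
    leaves = sorted(x for x in v if x >= 1)
    # (10): n leaves and hence n-1 operators; (11): distinct labels 1..n.
    if m % 2 == 0 or min(v) < -1 or leaves != list(range(1, n + 1)):
        return None
    size = [0] * m
    for i in range(m - 1, -1, -1):            # right to left
        if v[i] >= 1:
            size[i] = 1                       # first line of (12): a leaf
            continue
        if i + 1 >= m:                        # operator without children
            return None
        right = i + 1 + size[i + 1]           # root of the right subtree
        if right >= m:                        # right child falls outside V
            return None
        size[i] = 1 + size[i + 1] + size[right]   # second line of (12)
    return size if size[0] == m else None

print(subtree_sizes([0, -1, 1, 3, 2]))   # [5, 3, 1, 1, 1]  (P1 in Example 4)
print(subtree_sizes([0, 1, -1, 2, 3]))   # [5, 1, 3, 1, 1]  (P2, assumed a -> (b || c))
print(subtree_sizes([1, 0, 2, 3, -1]))   # None: an operator in the last position
```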

4.2. Pruning redundant trees

It is not difficult to observe that the same process can be represented by multiple distinct process trees. This happens because the sequential composition operator → is associative, while the parallel composition operator ∥ is both associative and commutative.

Example 5.
Process terms $a_{1} ∥ (a_{2} ∥ a_{3})$ , $(a_{1} ∥ a_{2}) ∥ a_{3}$ , and $a_{1} ∥ (a_{3} ∥ a_{2})$ represent the same process.

Firstly, we can restrict a process tree $L \to R$ such that L is either a singleton activity process or a parallel composition $L^{'} ∥ L^{″}$ (see equation (13)). $\begin{matrix} (13) & \begin{matrix} (\forall i : 1 \dots m - 2) \\ ((V_{i} = 0) \Rightarrow (V_{i + 1} \neq 0)) \end{matrix} \end{matrix}$

Similarly, we can restrict a process tree $L ∥ R$ such that L is either a singleton activity process or a sequential composition $L^{'} \to L^{″}$ (see equation (14)). $\begin{matrix} (14) & \begin{matrix} (\forall i : 1 \dots m - 2) \\ ((V_{i} = - 1) \Rightarrow (V_{i + 1} ⩾ 0)) \end{matrix} \end{matrix}$

Finally, to deal with the commutativity of parallel composition ∥, for each process $P = L ∥ R$ we constrain the lowest rank activity from $supp (P)$ to always be a member of $supp (L)$ (see equation (15)). $\begin{matrix} (15) & \begin{matrix} (\forall P = L ∥ R) \\ (min {i ∣ i \in supp (P)} \\ = min {i ∣ i \in supp (L)}) \end{matrix} \end{matrix}$

A more explicit formulation of constraint (15) is deferred for the next subsection.
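To see the effect of constraints (13), (14) and (15), one can enumerate, for small n and without any ordering constraints, the trees that they admit. The following sketch works directly on trees rather than on the sequence V; the nested-tuple encoding and the function names are assumptions made for illustration.

```python
from itertools import combinations

SEQ, PAR = 0, -1

def canonical_trees(acts):
    """Yield the process trees over activity set `acts` admitted by (13)-(15)."""
    acts = tuple(sorted(acts))
    if len(acts) == 1:
        yield acts[0]
        return
    for k in range(1, len(acts)):
        for left_set in combinations(acts, k):
            right_set = tuple(a for a in acts if a not in left_set)
            for left in canonical_trees(left_set):
                root = None if isinstance(left, int) else left[0]
                for right in canonical_trees(right_set):
                    if root != SEQ:
                        # (13): the left child of -> is not itself a -> node.
                        yield (SEQ, left, right)
                    if root != PAR and acts[0] in left_set:
                        # (14): the left child of || is not itself a || node;
                        # (15): the lowest-rank activity belongs to the left child.
                        yield (PAR, left, right)

def count(n):
    return sum(1 for _ in canonical_trees(range(1, n + 1)))

print([count(n) for n in range(1, 5)])   # [1, 3, 19, 195]
```

For n = 3 the 19 trees correspond to 6 purely sequential processes, 1 purely parallel process, and 12 mixed ones, so each distinct process is counted exactly once. The counts grow quickly with n, which is consistent with the expectation that exhaustive exploration does not scale.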

4.3. Enforcing ordering constraints

We now define the formal requirements of process trees represented as process terms that satisfy a set of ordering constraints. Analysing separately the situations when the process term is built using the sequential or parallel composition operator, we obtain an explicit characterization of conditions when a certain process tree satisfies the given ordering constraints. These conditions are stated by Proposition 1.

Proposition 1.
Let $P = L \to R$ or $P = L ∥ R$ be a process, with $supp (L) = Σ_{L}$ , $supp (R) = Σ_{R}$ , $supp (P) = Σ$ , and let $G = ⟨ Σ, E ⟩$ be an ordering graph. Then $P ⊧ G$ if and only if:

$Σ_{L} \cup Σ_{R} = Σ$ and $Σ_{L} \cap Σ_{R} = \emptyset$ .

$L ⊧ G_{L}$ and $R ⊧ G_{R}$ where $G_{L}$ and $G_{R}$ are subgraphs of $G$ induced by $Σ_{L}$ and $Σ_{R}$ .

If $P = L \to R$ then for each $(u, v) \in E$ either $u \in Σ_{L}$ or $v \in Σ_{R}$ .

If $P = L ∥ R$ then for each $(u, v) \in E$ either $u, v \in Σ_{L}$ or $u, v \in Σ_{R}$ .

A convenient formulation of the constraints resulting from Proposition 1 requires the extension of our model with additional decision variables representing the support sets of each subprocess of a given process.

For each $i = 1, 2, \dots, m$ let $S_{i} \subseteq 1 \dots n$ be the support set of subprocess represented by sequence V and rooted at i.

Example 6.
The support sets for process tree $P_{1}$ shown in Fig. 2 are $S_{1} = {1, 2, 3}$ , $S_{2} = {1, 3}$ , $S_{3} = {1}$ , $S_{4} = {3}$ , and $S_{5} = {2}$ . We assume that a is mapped to 1, b to 2 and c to 3.

Support sets are introduced using equation (16). $\begin{matrix} (16) & \begin{matrix} (\forall i : 1 \dots m) \\ S_{i} = \{ V_{j} > 0 ∣ i ⩽ j ⩽ i + L_{i} - 1 \} \end{matrix} \end{matrix}$
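As an illustration, the support sets of equation (16) can be computed directly from the sequence V. The following Python sketch assumes that V is the prefix (pre-order) encoding of the process tree, with 0 denoting sequential composition, -1 denoting parallel composition and positive values denoting activities, and that $L_{i}$ is the number of nodes of the subtree rooted at i, as suggested by the index ranges used in equations (16), (17) and (19). The function names, the 0-based indexing, and the encoding [0, -1, 1, 3, 2] of $P_{1}$ are our assumptions, chosen to agree with Example 6.

```python
def subtree_lengths(V):
    """Number of nodes of the subtree rooted at each position of V.

    Assumption: V is a prefix encoding; V[i] > 0 is a leaf (activity),
    V[i] in {0, -1} is a binary operator.
    """
    m = len(V)
    L = [0] * m
    for i in range(m - 1, -1, -1):      # children are stored after their parent
        if V[i] > 0:
            L[i] = 1
        else:
            left = i + 1
            right = left + L[left]
            L[i] = 1 + L[left] + L[right]
    return L


def support_sets(V, L):
    """Equation (16): activities occurring in positions i .. i + L[i] - 1."""
    return [{v for v in V[i:i + L[i]] if v > 0} for i in range(len(V))]


# Hypothetical encoding of P1 = (a || c) -> b with a=1, b=2, c=3.
V = [0, -1, 1, 3, 2]
L = subtree_lengths(V)
S = support_sets(V, L)
print(L)
print(S)
```

Under these assumptions the computed sets coincide with $S_{1}, \dots, S_{5}$ listed in Example 6 (shifted to 0-based positions).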

For the specification of the ordering constraints let us assume that the numbering of the nodes of ordering graph $G = ⟨ Σ, E ⟩$ is done according to one of the topological sortings of G. This basically means that the adjacency matrix of G is upper triangular.

The precedence constraints are captured using the adjacency matrix B of G. B is an upper triangular $n \times n$ matrix with elements in $\{ 0, 1 \}$ . Using this notation, constraints defined by Proposition 1 are directly mapped to equations (17). $\begin{array}{rcl} (17) & \begin{array}{l} (\forall k : 1 \dots m - 2) (\forall i : 1 \dots n) (\forall j : i + 1 \dots n) \\ (B_{i j} \Rightarrow ((V_{k} = - 1) \Rightarrow ((i, j \in S_{k}) \Rightarrow \\ ((i, j \in S_{k + 1}) \lor (i, j \in S_{k + 1 + L_{k + 1}}))))) \\ (\forall k : 1 \dots m - 2) (\forall i : 1 \dots n) (\forall j : i + 1 \dots n) \\ (B_{i j} \Rightarrow ((V_{k} = 0) \Rightarrow ((i, j \in S_{k}) \Rightarrow \\ ((i \in S_{k + 1}) \lor (j \in S_{k + 1 + L_{k + 1}}))))) \end{array} \end{array}$
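To make the role of equations (17) concrete, the following Python sketch checks them for a given candidate tree instead of posting them as constraints. It is only an illustration under our assumptions: V is taken to be a prefix encoding (0 for sequential, -1 for parallel, positive values for activities), positions are 0-based, the ordering graph is given as a list of arcs (i, j) with i < j, and all function names are hypothetical.

```python
def subtree_lengths(V):
    """Number of nodes in the subtree rooted at each position (prefix encoding)."""
    L = [0] * len(V)
    for i in range(len(V) - 1, -1, -1):
        if V[i] > 0:
            L[i] = 1
        else:
            left = i + 1
            L[i] = 1 + L[left] + L[left + L[left]]
    return L


def support(V, L, i):
    """Support set of the subtree rooted at position i."""
    return {v for v in V[i:i + L[i]] if v > 0}


def satisfies(V, arcs):
    """Check both families of implications (17) at every internal node."""
    L = subtree_lengths(V)
    for k, op in enumerate(V):
        if op > 0:
            continue                      # leaves impose no condition
        left = k + 1
        right = left + L[left]
        s_k, s_l, s_r = support(V, L, k), support(V, L, left), support(V, L, right)
        for i, j in arcs:
            if i in s_k and j in s_k:
                if op == -1 and not ({i, j} <= s_l or {i, j} <= s_r):
                    return False          # parallel node separates i and j
                if op == 0 and not (i in s_l or j in s_r):
                    return False          # sequential node runs j before i
    return True


arcs = [(1, 2)]                           # hypothetical graph: activity 1 precedes 2
print(satisfies([0, -1, 1, 3, 2], arcs))  # (1 || 3) -> 2
print(satisfies([-1, 1, 0, 2, 3], arcs))  # 1 || (2 -> 3)
print(satisfies([0, 2, -1, 1, 3], arcs))  # 2 -> (1 || 3)
```

Only the first of the three trees respects the arc (1, 2): the second places activities 1 and 2 in different branches of a parallel node, and the third executes 2 before 1.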

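Finally, constraint (15) can be made more explicit by introducing support sets, resulting in equation (18). Since (15) applies only to parallel compositions, the equality is required only at nodes labeled with $- 1$ . $\begin{matrix} (18) & (\forall i : 1 \dots m - 2) ((V_{i} = - 1) \Rightarrow (\min S_{i} = \min S_{i + 1})) \end{matrix}$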

Our proposed representation model is correct and complete, as stated precisely by Proposition 2. Completeness means that if a process satisfies the ordering constraints then it has a representation in our model that satisfies the equations. Correctness means that, conversely, if a representation in our model satisfies the equations then it defines a process that satisfies the ordering constraints.

Proposition 2.
Let Σ be a set of activities, let $G = (Σ, E)$ be an ordering graph and let P be a scheduling process with $supp (P) = Σ$ . Then $P ⊧ G$ if and only if P can be uniquely represented using a triple $(V, L, S)$ satisfying equations (10), (11), (12), (13), (14), (16), (17), and (18).

4.4. Computing optimization cost

Let us assume that durations of activities are represented using a sequence D of n positive real numbers.

For computing costs we introduce a vector C of m decision variables such that $C_{i}$ represents the cost of the subtree rooted at i.

So, the duration $d (P)$ of process P is defined by equations (19), following the composition rules of the $time$ nonrenewable resource (see Section 3.3): $\begin{matrix} (19) & \begin{array}{l} (\forall i : 1 \dots m) \\ C_{i} = \begin{cases} D_{V_{i}} & \text{if } V_{i} > 0 \\ \max (C_{i + 1}, C_{i + 1 + L_{i + 1}}) & \text{if } V_{i} = - 1 \\ C_{i + 1} + C_{i + 1 + L_{i + 1}} & \text{if } V_{i} = 0 \end{cases} \\ d (P) = C_{1} \end{array} \end{matrix}$
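The recurrence (19) can be evaluated in a single right-to-left pass over V. The Python sketch below is an illustration only: it assumes a prefix encoding of the tree (0 for sequential, -1 for parallel, positive values for activities), uses 0-based positions, and the durations in D are hypothetical values chosen for the example.

```python
def process_cost(V, D):
    """Evaluate equations (19); returns the vector C, with C[0] = d(P).

    V: prefix encoding of the tree (assumption), D: duration of each activity.
    """
    m = len(V)
    L = [0] * m                      # subtree sizes, needed to locate right children
    C = [0] * m
    for i in range(m - 1, -1, -1):
        if V[i] > 0:                 # leaf: cost is the activity duration
            L[i] = 1
            C[i] = D[V[i]]
        else:
            left = i + 1
            right = left + L[left]
            L[i] = 1 + L[left] + L[right]
            if V[i] == -1:           # parallel composition: the longer branch dominates
                C[i] = max(C[left], C[right])
            else:                    # sequential composition: durations add up
                C[i] = C[left] + C[right]
    return C


D = {1: 10, 2: 20, 3: 30}            # hypothetical durations of a, b, c
C = process_cost([0, -1, 1, 3, 2], D)
print(C, "d(P) =", C[0])
```

For the tree (a ∥ c) → b with these durations, the parallel block costs max(10, 30) = 30 and the whole process costs 30 + 20 = 50.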

5. Hierarchical decomposition processes

The number of candidate process trees that can be defined for n activities, ignoring ordering constraints and also counting redundant process trees (i.e. omitting pruning constraints), is equal to the number of well-formed binary trees with n leaves (Catalan number of order $n - 1$ ) times $2^{n - 1}$ (each of the $n - 1$ internal nodes can be labeled with $- 1$ or 0) times $n!$ (permutations of sequences of n leaves). This value is equal to $\frac{2^{n - 1}}{n} \binom{2 n - 2}{n - 1} n! = 2^{n - 1} \frac{(2 n - 2)!}{(n - 1)!}$ and it becomes very large even for small values of n.
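A short Python computation makes this growth tangible. The two functions below evaluate the product form and the closed form of the count given above; the function names are ours.

```python
from math import comb, factorial


def candidate_trees(n):
    """Catalan(n - 1) * 2^(n - 1) * n!, as in the counting argument."""
    catalan = comb(2 * n - 2, n - 1) // n
    return catalan * 2 ** (n - 1) * factorial(n)


def candidate_trees_closed_form(n):
    """Closed form 2^(n - 1) * (2n - 2)! / (n - 1)!."""
    return 2 ** (n - 1) * factorial(2 * n - 2) // factorial(n - 1)


for n in range(4, 9):
    print(n, candidate_trees(n))
```

For example, the count is 960 for n = 4 and already exceeds two billion (2214051840) for n = 8.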

So, rather than searching this huge space of feasible solutions, an alternative approach would be to define and then explore a significantly smaller subspace, that still contains reasonable suboptimal processes.

In this section we follow the idea of hierarchical decomposition of the ordering graph initially introduced in [2]. However, instead of defining a single hierarchical decomposition process, we use CLP to define the subspace of all hierarchical decomposition processes that are “consistent” with the hierarchical decomposition of the ordering graph, and then we explore it to pick the optimal hierarchical decomposition process, thus obtaining a suboptimal solution for our problem.

5.1. Subspace of hierarchical decomposition processes

Let $G = ⟨ Σ, E ⟩$ be an ordering graph.

For each node $v \in Σ$ we define the set $I (v)$ of input neighbors of v as $I (v) = \{ u \in Σ ∣ (u, v) \in E \}$ , and the set $O (v)$ of output neighbors of v as $O (v) = \{ u \in Σ ∣ (v, u) \in E \}$ .

For each node $v \in Σ$ we recursively define the level $l (v)$ of v as a function $l : Σ \to N$ introduced by equation (20). $\begin{matrix} (20) & \begin{array}{l} (\forall v \in Σ) \ \text{if } I (v) = \emptyset \text{ then } l (v) = 0 \\ (\forall v \in Σ) \ \text{if } I (v) \neq \emptyset \text{ then } \\ l (v) = 1 + \max_{u \in I (v)} l (u) \end{array} \end{matrix}$

Let $h (G)$ be the height of graph $G$ , defined as $h (G) = \max_{v \in Σ} l (v)$ .

Example 7.
Considering the graph $G_{1}$ from Fig. 2, we have: $l (a) = l (c) = 0$ , $l (b) = 1$ , and $h (G_{1}) = 1$ .
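The level mapping of equation (20) and the height $h (G)$ can be computed with a simple memoized recursion. The Python sketch below is illustrative; since Fig. 2 is not reproduced here, we assume that $G_{1}$ has the single arc (a, b), which is consistent with the levels reported in Example 7.

```python
def levels(nodes, arcs):
    """Level of each node according to equation (20)."""
    preds = {v: [u for (u, w) in arcs if w == v] for v in nodes}
    memo = {}

    def level(v):
        if v not in memo:
            # nodes without input neighbors are at level 0
            memo[v] = 0 if not preds[v] else 1 + max(level(u) for u in preds[v])
        return memo[v]

    return {v: level(v) for v in nodes}


# Assumed structure of G1: a precedes b, c is unconstrained.
lv = levels(["a", "b", "c"], [("a", "b")])
height = max(lv.values())
print(lv, height)
```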

Now, rather than defining a single hierarchical decomposition of $G$ , induced by mapping l, we define the nonempty space of hierarchical decompositions of $G$ .

A hierarchical decomposition mapping of ordering graph $G$ is a function $p : Σ \to N$ such that: $\begin{matrix} (21) & \begin{array}{l} (\forall v \in Σ) \ \text{if } O (v) = \emptyset \text{ then } p (v) ⩽ h (G) \\ (\forall v \in Σ) \ \text{if } I (v) \neq \emptyset \text{ then } \\ p (v) ⩾ 1 + \max_{u \in I (v)} p (u) \end{array} \end{matrix}$

Note that the set $H (G)$ of hierarchical decomposition mappings p satisfying equations (21) is not empty, as obviously mapping l is an element of this set.

For each mapping $p \in H (G)$ we can define a scheduling process $P_{HD} (G, p)$ that satisfies $G$ . This result is stated by the following proposition.
Proposition 3 (Hierarchical Decomposition Process).

Let $G = ⟨ Σ, E ⟩$ be an ordering graph and let $p \in H (G)$ . The hierarchical decomposition process $P_{HD} (G, p)$ associated to $G$ and p is defined as:

$Σ_{i} = \{ v ∣ p (v) = i \}$ for all $i = 0, 1, \dots, h (G)$ .

$P_{i} = ∥_{v \in Σ_{i}} v$ for all $0 ⩽ i ⩽ h (G)$ .

$P_{HD} (G, p) = P_{0} \to P_{1} \to \dots \to P_{h (G)}$ .

Then $P_{HD} (G, p) ⊧ G$ .

We are interested in determining the optimal hierarchical decomposition process, defined by equations (22). $\begin{matrix} (22) & \begin{array}{l} p^{*} = \underset{p \in H (G)}{\arg \min} \ d (P_{HD} (G, p)) \\ P_{HD} (G) = P_{HD} (G, p^{*}) \\ d_{HD} (G) = d (P_{HD} (G)) \end{array} \end{matrix}$

5.2. Further pruning rule

In this section we introduce an additional pruning rule based on the height of an ordering graph introduced in the previous subsection. It states an upper bound for the number of parallel compositions and it can be useful to prune the search of an optimal process that satisfies the ordering graph.

Proposition 4 (Maximum Number of Parallel Composition Operators).

Let $G = ⟨ Σ, E ⟩$ be an ordering graph, $n = | Σ |$ , and let P be a scheduling process such that $P ⊧ G$ . Then the number $N_{∥}$ of parallel composition operators that occur in P must satisfy inequality (23): $\begin{matrix} (23) & N_{∥} ⩽ n - h (G) - 1 \end{matrix}$

5.3. Relational representation of hierarchical decomposition processes

Assuming that $h = h (G)$ is known, a hierarchical decomposition mapping (and process) can be represented with a sequence H of n decision variables with values in the set $\{ 0, 1, \dots, h \}$ . The optimal hierarchical decomposition process can be determined by minimizing variable $CHD$ defined by equations (24). $\begin{matrix} (24) & \begin{array}{l} (\forall i : 1 \dots n) ((O (i) = \emptyset) \Rightarrow (H_{i} ⩽ h)) \\ (\forall i : 1 \dots n) ((I (i) \neq \emptyset) \Rightarrow \\ (H_{i} ⩾ 1 + \max_{j \in I (i)} H_{j})) \\ CHD = \sum_{i = 0}^{h} \max_{j = 1}^{n} \{ D_{j} ∣ H_{j} = i \} \end{array} \end{matrix}$

Example 8.
Figure 2 presents two hierarchical decomposition processes $P_{1}$ and $P_{2}$ for the ordering graph included in the figure. Their corresponding decomposition mappings are represented with sequences $H_{1} = [0, 1, 0]$ and $H_{2} = [0, 1, 1]$ , which define respectively the following partitions of graph nodes: $\{ \{ 1, 3 \}, \{ 2 \} \}$ and $\{ \{ 1 \}, \{ 2, 3 \} \}$ . We assume that a is mapped to 1, b to 2 and c to 3.
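For very small graphs, the subspace defined by equations (24) can be enumerated exhaustively, which gives a convenient way to cross-check a CLP implementation. The Python sketch below is not the ECLiPSe-CLP model used in our experiments; it is a brute-force illustration in which the single arc (1, 2) of the graph and the durations in D are assumptions made for the example.

```python
from itertools import product


def hd_mappings(n, arcs, h):
    """Enumerate all sequences H in {0..h}^n satisfying constraints (24).

    The bound H_i <= h holds by construction of the domain, so only the
    precedence constraint needs to be tested.
    """
    preds = {i: [u for (u, v) in arcs if v == i] for i in range(1, n + 1)}
    for H in product(range(h + 1), repeat=n):
        if all(H[i - 1] >= 1 + max(H[j - 1] for j in preds[i])
               for i in preds if preds[i]):
            yield H


def chd(H, D, h):
    """Cost CHD: sum over levels of the longest activity placed at that level."""
    return sum(max((D[j] for j, lv in enumerate(H) if lv == level), default=0)
               for level in range(h + 1))


arcs = [(1, 2)]          # assumed ordering graph: activity 1 precedes activity 2
D = [10, 20, 30]         # hypothetical durations of activities 1, 2, 3
h = 1                    # height of the graph
mappings = list(hd_mappings(3, arcs, h))
best = min(mappings, key=lambda H: chd(H, D, h))
print(mappings)
print(best, chd(best, D, h))
```

With these values exactly the two mappings of Example 8 are feasible, with costs 50 and 40, so the second one is selected.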

5.4. Adding resource constraints

The process $P_{HD} (G, p)$ might violate resource constraints. Fortunately, we can transform $P_{HD} (G, p)$ into a process $P_{HD}^{'} (G, p)$ that satisfies both ordering and resource constraints using a variant of the well known bin-packing problem, as described in [11] (problem SR1).

Let $Σ_{0}, Σ_{1}, \dots, Σ_{h}$ be the optimal hierarchical decomposition of Σ, i.e. $Σ_{l} = \{ a \in Σ ∣ p^{*} (a) = l \}$ for all $l = 0, 1, \dots, h$ . The activities of $Σ_{l}$ are items that must be placed in bins representing a partition of $Σ_{l}$ such that the cost of the partition is minimized. Each bin is multiply constrained by the upper bounds of the available renewable resources.

If $| Σ_{l} | = n_{l}$ then a nontrivial partition of $Σ_{l}$ will contain at most $n_{l}$ subsets. It can be represented using a vector $H^{l}$ of $n_{l}$ decision variables in range $1 \dots n_{l}$ . So we obtain a series of $h + 1$ constrained bin packing problems, one for each $l = 0, 1, \dots, h$ , with the bins optimization cost ${Cost}_{l}$ defined by equations (25). $\begin{matrix} (25) & \begin{array}{l} (\forall l : 0 \dots h) (\forall i : 1 \dots n_{l}) (\forall r \in R) \\ \sum_{a \in Σ_{l}} (H_{a}^{l} = i) ρ_{a, r} ⩽ ρ_{r} \\ (\forall l : 0 \dots h) \\ {Cost}_{l} = \sum_{i = 1}^{n_{l}} \max_{a \in Σ_{l}} (H_{a}^{l} = i) ρ_{a, time} \end{array} \end{matrix}$
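The bin packing subproblem for a single level can be illustrated by exhaustive enumeration of the assignment vector $H^{l}$ . The Python sketch below is a brute-force illustration of equations (25), not the search procedure used in our prototype; the durations, resource demands and capacity are hypothetical values.

```python
from itertools import product


def best_partition(durations, demands, capacity):
    """Minimize Cost_l of (25) for one level by trying every bin assignment.

    durations[a]   : duration of activity a (the 'time' resource)
    demands[a][r]  : demand of activity a for renewable resource r
    capacity[r]    : upper bound of renewable resource r
    """
    n = len(durations)
    best_cost, best_assign = None, None
    for assign in product(range(n), repeat=n):       # assign[a] = bin of activity a
        cost, feasible = 0, True
        for b in range(n):
            members = [a for a in range(n) if assign[a] == b]
            if not members:
                continue
            # each bin must respect every renewable resource bound
            if any(sum(demands[a][r] for a in members) > cap
                   for r, cap in enumerate(capacity)):
                feasible = False
                break
            cost += max(durations[a] for a in members)  # a bin lasts as long as its longest activity
        if feasible and (best_cost is None or cost < best_cost):
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


cost, assign = best_partition(
    durations=[10, 30, 20],
    demands=[[2], [2], [1]],     # one renewable resource
    capacity=[3],
)
print(cost, assign)
```

In this example the first two activities cannot share a bin (their joint demand is 4 > 3), so the cheapest partition groups the second and third activities and runs the first one alone, giving a cost of 10 + 30 = 40. The enumeration is exponential in $n_{l}$ and is meant only for checking small cases.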

6. Computational experiments

We have performed two classes of computational experiments with our models introduced in Section 4. The first class of experiments involves models without resource constraints. The second class of experiments involves models with renewable resource constraints. In both classes we are interested in the minimization of the process makespan. So, according to the discussion in Section 3.3, we consider that “time” is the only nonrenewable resource available in all our experiments.

The first class of experiments, without resource constraints, is further subdivided into two subclasses as follows:

Experiments for assessing the correctness of the CLP model of process trees with ordering constraints.

Experiments for exploring the subspace of hierarchical decomposition processes defined by an ordering graph.

The second class of experiments aimed to assess the correctness and feasibility of our approach by exploring the subspace of hierarchical decomposition processes defined by an ordering graph and satisfying the resource constraints.

6.1. Experimental setup

In this subsection we introduce the tools and data sets used in our experiments.

6.1.1. Environment

We have used the 64-bit version of ECLiPSe-CLP [33] 7.0 #44 on an x64-based PC with a 2-core / 4-thread Intel© Core™ i7-5500U CPU at 2.40 GHz running Windows 10. We extracted the timing information using the statistics(hr_time, Time) system predicate, which returns the value of a high-resolution timer in seconds [33].

We have developed two experimental CLP programs² based on our proposed models.

² The complete ECLiPSe-CLP code and the data sets that we have used in our experiments can be downloaded from http://software.ucv.ro/~cbadica/aicomm2020.zip.

The ECLiPSe-CLP implementations that we developed use the following programming techniques, inspired by our previous experimental work with CLP [3,5]:

Declarative loops for the implementation of model formulas involving universal quantifiers [29].

Reified constraints that allow mixing integer constraints with Boolean constraints by reifying truth values as 0 and 1.

We have tried both the integer arithmetic constraint solver IC, which is a native part of ECLiPSe-CLP, and the Gecode generic constraint solver [31] version 4.4.0, which is incorporated into ECLiPSe-CLP as an external library. Most of the experiments were carried out with Gecode, as we found it to be faster than IC for our models.

Note that we have used both exhaustive search, to explore the set of solutions for small-scale problems, and branch-and-bound search, to determine suboptimal solutions for larger problems [25].

6.1.2. Data sets

First Data Set. For the first class of experiments we generated a number of random DAGs of increasing size representing ordering constraints, as well as random durations of execution for each activity of the graph. The parameters of each instance of this data set are: number n of graph nodes, number $ng$ of generated graphs, minimum and maximum durations $dmin$ and $dmax$ of each activity, and the density factor $f \in [0, 1]$ of the graph. The higher this factor, the denser the graph. The value of f is given as a percentage. We assume that the nodes of each DAG are topologically sorted such that the adjacency matrix is upper triangular. This means that if there is an arc from node i to node j then $i < j$ .

The DAGs were generated for the following values of the parameters: $ng = 4$ , $n \in \{ 4, 5, 6, 7, 8, 20, 50, 100 \}$ , $dmin = 10$ , $dmax = 30$ , and density factor $f \in \{ 20\%, 30\%, 45\%, 60\%, 75\% \}$ . For each test we recorded the total execution time and the values of the metrics of interest. We labelled each DAG data set to reflect the values of its parameters. For example, if $n = 10$ , $f = 30\%$ , and the DAG number is 2 then the label is g10-30-2.
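The following Python sketch shows one possible way to produce instances of this kind. It is not the generator that was used for our data sets: we assume here, purely for illustration, that each candidate arc (i, j) with i < j is included independently with probability f, and the function names are hypothetical. The labelling follows the scheme described above.

```python
import random


def random_instance(n, f_percent, dmin, dmax, rng):
    """Random topologically numbered DAG plus integer activity durations.

    Assumption: every arc (i, j), i < j, is kept with probability f_percent / 100.
    """
    arcs = [(i, j)
            for i in range(1, n + 1)
            for j in range(i + 1, n + 1)
            if rng.random() < f_percent / 100]
    durations = [rng.randint(dmin, dmax) for _ in range(n)]
    return arcs, durations


def label(n, f_percent, k):
    """Label of the k-th graph with n nodes and density f_percent."""
    return f"g{n}-{f_percent}-{k}"


rng = random.Random(2020)           # fixed seed for reproducibility
arcs, durations = random_instance(8, 30, 10, 30, rng)
print(label(8, 30, 1), arcs, durations)
```

Because arcs always go from a lower to a higher node number, the adjacency matrix of every generated graph is upper triangular, as required above.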

Each data set instance was saved in Prolog format, following the schema presented in Listing 1.

Listing 1.

Scheduling problem given as a set of Prolog facts

Note that the sizes of our data sets are significantly smaller than those used in our previous work [2]. This is explained by the fact that here we are using generic constraint-based solvers, which are designed to work with arbitrary constraint-based models, while in [2] we used a heuristic algorithm specially tailored to the problem at hand.

Second Data Set. For the second class of experiments we have applied our prototype CLP program to the j30 data set from [28]. This benchmark set contains 480 resource-constrained project scheduling problems, each problem involving the optimization of the total makespan of projects with 30 activities and 4 renewable resources.

Each problem of this data set was converted from the .RCP format into Prolog, following the schema presented in Listing 2.

Listing 2.

Resource constraint scheduling problem given as a set of Prolog facts

6.2. Results without resource constraints

6.2.1. Correctness experiments

The first set of experiments was focused on the correctness of the model introduced by Proposition 2. The model was manually translated to ECLiPSe-CLP and was enhanced with pruning rule (23) and bounding rule (26), where l is the level mapping (see equation (20)). $\begin{matrix} (26) & d (P) ⩽ d (P_{HD} (G, l)) \end{matrix}$

These were small-scale experiments, for graphs of sizes $4 ⩽ n ⩽ 8$ . For the smallest values $n = 4, 5$ we performed an exhaustive search (queries 1, 2, 3, 4, and 5 in Table 1), while for larger values $n = 6, 7, 8$ we used branch and bound and incomplete limited discrepancy search [12] (queries 6 and 7 in Table 1). Note that we used the IC solver only in query 2; as we recorded that Gecode was significantly faster, we used Gecode in all the other queries from Table 1.

Table 1
Queries for the first set of experiments

#	Query
1	search(V, 0, input_order, indomain, complete, [control(gfd_control(_, _, 4))])
2	search(V, 0, input_order, indomain, complete, []) %% ic solver
3	search(V, 0, input_order, indomain, complete, []) %% gfd solver
4	search(V, 0, input_order, indomain, complete, [control(gfd_control(_, _, 2))])
5	gfd_search:search(V, 0, input_order, indomain, complete, [])
6	search(V, 0, input_order, indomain, bb_min(CostMin), [timeout(120), control(gfd_control(_, _, 4))])
7	gfd_search:search(V, 0, input_order, indomain, lds(6), [])
8	search(HPos, 0, input_order, indomain, bb_min(CostHD1), [timeout(120), control(gfd_control(_, _, 4))])

Table 2

Results for graphs of size $n = 4, 5$

Graph	#nodes	#arcs	H	HD	OPT	#solutions	$T_{1}$ [sec.]	$T_{2}$ [sec.]	$T_{3}$ [sec.]	$T_{4}$ [sec.]	$T_{5}$ [sec.]
g4-20-1	4	1	1	34	29	8	0.052	0.164	0.051	0.048	0.065
g4-30-1	4	2	1	57	57	3	0.053	0.174	0.052	0.050	0.062
g4-30-4	4	3	3	70	70	1	0.055	0.150	0.055	0.051	0.053
g4-45-1	4	4	2	83	80	2	0.058	0.159	0.053	0.052	0.055
g4-45-2	4	5	2	65	65	1	0.055	0.191	0.065	0.054	0.056
g4-60-2	4	5	2	62	62	1	0.042	0.173	0.051	0.054	0.056
g4-75-1	4	6	3	83	83	1	0.039	0.025	0.027	0.029	0.027
g5-20-4	5	2	2	55	51	15	0.308	2.645	0.589	0.413	0.668
g5-30-1	5	4	2	76	69	19	0.394	3.701	0.836	0.651	0.888
g5-45-3	5	6	3	81	81	3	0.229	1.698	0.421	0.293	0.397
g5-60-1	5	7	3	93	93	1	0.241	1.604	0.313	0.303	0.363
g5-75-3	5	8	3	81	81	1	0.212	1.613	0.316	0.287	0.370
Time [sec.]							6.534	47.9	11.222	8.781	12.375

Table 2 presents some of the results obtained for graphs with $n = 4, 5$ using queries 1, 2, 3, 4, and 5. The goal of this first part of the experiment was to check the correctness of the model and to select the best solver and its corresponding settings. The table reports the following values: the height $h (G)$ of the graph, denoted by H; $d (P_{HD} (G, l))$ , denoted by HD; the optimal value determined by the search (OPT); the number of solutions obtained; and the running time $T_{i}$ when using query $i = 1, 2, 3, 4, 5$ .

The last row of Table 2 presents the total time consumed by running a given query for the whole set of 19 graphs (note, however, that results for only 12 of them are shown here). We can observe that the smallest execution time was obtained for query 1. This result was somewhat expected, as query 1 is a single direct call to Gecode using 4 threads and our experiments were performed on a machine with 2 cores / 4 threads, so this setting can be considered optimal from this perspective. By comparison, query 4 uses only 2 threads, while query 5 is compiled into a sequence of calls to Gecode using a default configuration (i.e. the number of threads could not be explicitly set). The largest execution time was recorded by query 2, which used the IC solver.

Table 3

Results for graphs of size $n = 6, 7, 8$

Graph	#nodes	#arcs	H	HD	OPT6	OPT7	HD-OPT	$T_{6}$ [sec.]	$T_{7}$ [sec.]	$T_{8}$ [sec.]
g6-20-2	6	3	1	43	33	-	-	0.864	-	-
g6-30-1	6	2	1	36	33	-	-	2.047	-	-
g6-30-2	6	6	3	74	63	-	-	4.650	-	-
g6-45-1	6	5	3	90	75	-	-	4.448	-	-
g6-60-1	6	7	3	72	67	-	-	5.055	-	-
g6-60-4	6	10	4	113	107	-	-	1.805	-	-
g6-75-3	6	11	5	97	97	-	-	0.159	-	-
g7-20-1	7	6	2	70	70^∗	-	-	-	-	-
g7-20-2	7	1	1	42	36	-	-	15.619	-	-
g7-20-3	7	4	1	54	54	-	-	63.108	-	-
g7-30-1	7	9	3	106	97^∗	-	-	-	-	-
g7-30-2	7	8	2	76	72^∗	-	-	-	-	-
g7-30-3	7	5	2	71	52	-	-	71.372	-	-
g7-45-3	7	10	4	102	96	-	-	73.301	-	-
g7-60-2	7	11	3	105	92	-	-	115.008	-	-
g7-75-3	7	17	5	139	124	-	-	12.282	-	-
g7-75-4	7	12	4	107	99	-	-	76.113	-	-
g8-20-1	8	4	1	58	58^∗	58	58	-	44.883	0.006
g8-20-2	8	4	2	60	57^∗	57	60	-	61.852	0.046
g8-20-3	8	4	1	57	52^∗	56	57	-	47.379	0.012
g8-20-4	8	7	2	54	48^∗	-	54	-	70.524	0.043
g8-30-1	8	4	2	57	40^∗	-	52	-	64.413	0.015
g8-30-3	8	7	4	104	104^∗	104	104	-	73.077	0.015
g8-75-4	8	22	6	145	145	-	145	71.715	43.306	0.017

Table 3 presents some of the results obtained for graphs with $n = 6, 7, 8$ using queries 6, 7, and 8. Note that query 8 is actually part of the second experiment. Query 6 was run for graphs with $n = 6, 7, 8$ , while queries 7 and 8 were run only for graphs with $n = 8$ . The reason is that query 7 fails to obtain solutions for some graphs of size $n = 8$ , thus highlighting the computational limits of our first model (see Proposition 2), while query 8 always determines a (suboptimal) solution, as it is based on the simplified model introduced in Section 5.

Missing values in column $T_{6}$ in Table 3 denote that the timeout of 120 sec set by query 6 was reached, and the search could not be completed. This corresponds to values in column OPT6 marked with ^∗. Columns OPT6, OPT7 and HD-OPT denote the values computed by queries 6, 7, and 8. For example, a missing value in column OPT7 for a graph with $n = 8$ means that query 7 failed to produce a solution for this graph.

6.2.2. Results with hierarchical decomposition subspace

In this section we present experimental results obtained using the hierarchical decomposition model. This model is simpler and it allows computing suboptimal solutions for significantly larger problems than the complete model. It involved the use of query 8 presented in Table 1. The outcome is represented by the array $HPos$ of decision variables capturing the optimal hierarchical decomposition mapping, which corresponds to a suboptimal solution to our problem. Nevertheless, in order to estimate how far we are from the optimal solution, we also report the length of the critical path [13] of the graph, which is a lower bound on the cost of the optimal solution.
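The critical path lower bound can be obtained by a single forward pass over the topologically numbered nodes. The Python sketch below is a minimal illustration; the arc list and the durations are hypothetical, and the function name is ours.

```python
def critical_path(n, arcs, D):
    """Length of the longest duration-weighted path in a DAG.

    Nodes are numbered 1..n in topological order, so every predecessor of v
    has already been processed when v is reached.
    """
    finish = [0] * (n + 1)               # earliest finish time of each node
    for v in range(1, n + 1):
        start = max((finish[u] for (u, w) in arcs if w == v), default=0)
        finish[v] = start + D[v - 1]
    return max(finish[1:])


# Hypothetical instance: activity 1 precedes activity 2, activity 3 is free.
cp = critical_path(3, [(1, 2)], [10, 20, 30])
print(cp)
```

Here the two maximal paths have lengths 10 + 20 = 30 and 30, so no process satisfying the ordering constraints can have a duration below 30.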

We performed experiments on a set of graphs of larger size $n = 20, 50, 100$ . Table 4 presents a subset of these results. For each value of n we processed 20 graphs. The total processing time for the largest value, $n = 100$ , was $T = 1538.73$ sec. This means that on average the processing of a single graph took about 77 sec, i.e. slightly more than one minute.

Table 4
Results for graphs of size $n = 20, 50, 100$

Graph	#nodes	#arcs	H	HD	CP	HD-OPT	T [sec.]
g20-20-2	20	38	7	177	140	148	0.055
g20-30-1	20	50	8	237	192	205	0.028
g20-45-2	20	85	11	244	240	240	0.071
g20-60-1	20	101	11	283	253	263	0.067
g20-75-2	20	133	11	266	243	246	0.067
g50-20-4	50	250	13	356	297	325	4.698
g50-30-2	50	362	19	489	439	455	0.191
g50-45-3	50	530	25	611	566	588	0.261
g50-60-2	50	716	28	643	605	628	0.308
g50-75-2	50	926	38	856	829	848	0.411
g100-20-3	100	1016	26	702	606	643	120.685
g100-30-2	100	1445	38	942	821	918	121.289
g100-45-4	100	2234	54	1148	1055	1113	71.264
g100-60-2	100	3023	62	1356	1272	1303	7.383
g100-75-1	100	3724	80	1613	1595	1600	10.947

6.3. Results with resource constraints

We have applied our prototype CLP program to the j30 data set from [28]. This benchmark set contains 480 resource-constrained project scheduling problems, each problem involving the optimization of the total makespan of projects with 30 activities and 4 renewable resources.

The total time for processing our data set was $342.068$ sec. The searches used a timeout of $120$ sec. This timeout was exceeded for a single problem, while for the remaining 479 problems the average processing time was $0.462$ sec, with a minimum of $0.286$ sec and a maximum of $11.29$ sec. Note that for 5 problems the processing time was above $1$ sec.

Fig. 3.

Experimental results for problems 100–199 of the j30 data set showing problem (X axis) vs project makespan (Y axis).

Taking into account that we did not have the optimal makespan values for our sample data set (PSPLIB contains optimal values for the provided data sets, but only for unstructured optimal schedules), we compared our results with the following measures (see Fig. 3 that presents the results obtained for problems 100–199 of j30 data set):

The critical path of the set of activities [13] denoted with $CP$ .

The costs associated with the hierarchical decomposition processes defined by mappings $l$ and $p^{*}$ , denoted with $HD$ and $HDOpt$ .

The suboptimal cost obtained using our heuristic approach, denoted with $Cost$ .

The upper and lower bounds of the costs, representing respectively the sum of the durations of all the activities ( $Max$ ) and the maximum duration of an activity ( $Min$ ).

7. Conclusions and future works

We have developed CLP models for optimization of block structured scheduling processes. The correctness of the models was theoretically assessed. These models were fed to a CLP solver to explore the space of solutions. The results showed that, despite its intellectual charm, CLP is sometimes too slow compared with other approaches, thus hindering larger-scale experiments. Nevertheless, careful identification of interesting subspaces of potential solutions might drastically prune the search, enabling CLP to produce suboptimal solutions of practical value.

We have also extended our formal model of block structured scheduling processes with resource constraints, following ideas from traditional project scheduling literature. We have presented experimental results obtained using a Constraint Logic Programming prototype and a heuristic search algorithm derived by combining hierarchical decomposition of directed acyclic graphs with bin packing. We have used a standard benchmark data set from the project scheduling literature.

As future work we plan to strengthen our results by expanding the experimental evaluation, possibly with different solvers, and using other search strategies. We also envision possibly extending the model to include multiple mode activities, as typically encountered in classical project scheduling.

References

Aler,

Borrajo,

Camacho and

Sierra-Alonso, A knowledge-based approach for business process reengineering, SHAMASH, Knowledge-Based Systems 15(8) (2002), 473–483. doi:10.1016/S0950-7051(02)00032-1.

Bădică,

Dănciulescu and

Logofătu, Greedy heuristics for automatic synthesis of efficient block-structured scheduling processes from declarative specifications, in: Artificial Intelligence Applications and Innovations – Proceedings 14th IFIP WG 12.5 International Conference, AIAI 2018,

L.S.

Iliadis,

Maglogiannis and

V.P.

Plagianakos, eds, IFIP Advances in Information and Communication Technology, Vol. 519, Springer, Cham, 2018, pp. 183–195. doi:10.1007/978-3-319-92007-8_16.

Bădică,

Ivanović and

Doina, A CLP approach for solving the maximum clique problem: Benefits and limits, in: Proceedings 21st International Conference on System Theory, Control and Computing, ICSTCC 2017, IEEE, 2017, pp. 613–617. doi:10.1109/ICSTCC.2017.8107103.

Bădică,

Ivanović and

Logofătu, Exploring the space of block structured scheduling processes using constraint logic programming, in: Intelligent Distributed Computing XIII, Proceedings 13th International Symposium on Intelligent Distributed Computing, IDC 2019,

I.V.

Kotenko,

Bădică,

Desnitsky,

D.E.

Baz and

Ivanović, eds, Studies in Computational Intelligence, Vol. 868, Springer, 2019, pp. 149–159. doi:10.1007/978-3-030-32258-8_17.

Bădică,

Leon and

Luncean, Declarative representation and solution of vehicle routing with pickup and delivery problem, in: Proceedings International Conference on Computational Science, ICCS 2017,

Koumoutsakos,

Lees,

V.V.

Krzhizhanovskaya,

J.J.

Dongarra and

P.M.A.

Sloot, eds, Procedia Computer Science, Vol. 108, Elsevier, 2017, pp. 958–967. doi:10.1016/j.procs.2017.05.261.

Bădică,

Logofătu,

Buligiu and

Ciora, Modeling block structured project scheduling with resource constraints, in: Large-Scale Scientific Computing – Proceedings 12th International Conference, LSSC 2019,

Lirkov and

Margenov, eds, Lecture Notes in Computer Science, Vol. 11958, Springer, 2019, pp. 484–492. doi:10.1007/978-3-030-41032-2_55.

Baral and

Uyan, Declarative specification and solution of combinatorial auctions using logic programming, in: Logic Programming and Nonmotonic Reasoning: 6th International Conference, LPNMR 2001 Vienna, Austria, September 17–19, 2001 Proceedings,

Eiter,

Faber and

M.l.

Truszczyński, eds, Lecture Notes in Computer Science, Vol. 2173, Springer Berlin Heidelberg, Berlin, Heidelberg, 2001, pp. 186–199. doi:10.1007/3-540-45402-0_14.

Charwat and

Pfandler, Democratix: A declarative approach to winner determination, in: Algorithmic Decision Theory: 4th International Conference, ADT 2015, Lexington, KY, USA, September 27–30, 2015, Proceedings,

Walsh, ed., Lecture Notes in Computer Science, Vol. 9346, Springer International Publishing, 2015, pp. 253–269. doi:10.1007/978-3-319-23114-3_16.

Dumas,

La Rosa,

Mendling and

H.A.

Reijers, Fundamentals of Business Process Management, 2nd edn, Springer, 2018. doi:10.1007/978-3-662-56509-4.

10.

H.M.

Ferreira and

D.R.

Ferreira, An integrated life cycle for workflow management based on learning and planning, Int. J. Cooperative Inf. Syst. 15(4) (2006), 485–505. doi:10.1142/S0218843006001463.

11.

M.R.

Garey and

D.S.

Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, 1979.

12.

W.D.

Harvey and

M.L.

Ginsberg, Limited discrepancy search, in: Proceedings of the 14th International Joint Conference on Artificial Intelligence – Volume 1, IJCAI’95, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1995, pp. 607–613, http://ijcai.org/Proceedings/95-1/Papers/080.pdf .

13.

J.E.

Kelley, Critical-path planning and scheduling: Mathematical basis, Operations Research 9(3) (1961), 296–320. doi:10.1287/opre.9.3.296.

14.

Kolisch and

Sprecher, PSPLIB – a project scheduling library, European Journal of Operational Research 96(1) (1997), 205–216. doi:10.1016/S0377-2217(96)00170-1.

15.

S.J.J.

Leemans,

Fahland and

W.M.P.

van der Aalst, Discovering block-structured process models from event logs – a constructive approach, in: Application and Theory of Petri Nets and Concurrency – Proceedings 34th International Conference, PETRI NETS 2013,

J.M.

Colom and

Desel, eds, Lecture Notes in Computer Science, Vol. 7927, Springer, 2013, pp. 311–329. doi:10.1007/978-3-642-38697-8_17.

16.

Lierler, What is answer set programming to propositional satisfiability, Constraints 22(3) (2017), 307–337. doi:10.1007/s10601-016-9257-7.

17.

Malik and

Zhang, Boolean satisfiability: From theoretical hardness to practical success, Communications of the ACM 52(8) (2009), 76–82. doi:10.1145/1536616.1536637.

18.

Marrella, Automated planning for business process management, J. Data Semantics 8(2) (2019), 79–98. doi:10.1007/s13740-018-0096-0.

19.

Marrella and

Lespérance, A planning approach to the automated synthesis of template-based process models, Service Oriented Computing and Applications 11(4) (2017), 367–392. doi:10.1007/s11761-017-0215-z.

20.

Matejaš and

Fertalj, Building a BPM application in an SOA-based legacy environment, Computer Science and Information Systems 16(1) (2019), 45–74. doi:10.2298/CSIS171005010M.

21.

Mili,

Tremblay,

G.B.

Jaoude,

Lefebvre,

Elabed and

El-Boussaidi, Business process modeling languages: Sorting through the alphabet soup, ACM Computing Surveys 43(1) (2010), 4:1–4:56. doi:10.1145/1824795.1824799.

22.

Mrasek, Automatic synthesis and verification of industrial commissioning processes, PhD thesis, Fakultät für Informatik, Karlsruher Instituts für Technologie, 2016.

23.

Mrasek, J.A. Mülle and Böhm, Process synthesis with sequential and parallel constraints, in: On the Move to Meaningful Internet Systems: Proceedings OTM 2016 Conferences – Confederated International Conferences: CoopIS, C&TC, and ODBASE 2016, Debruyne, Panetto, Meersman, T.S. Dillon, Kühn, O’Sullivan and C.A. Ardagna, eds, Lecture Notes in Computer Science, Vol. 10033, 2016, pp. 43–60. doi:10.1007/978-3-319-48472-3_3.

24.

Nalepa and Ligęza, The HeKatE methodology. Hybrid engineering of intelligent systems, International Journal of Applied Mathematics and Computer Science 20(1) (2010), 35–53. doi:10.2478/v10006-010-0003-9.

25.

Niederliński, A Gentle Guide to Constraint Logic Programming via ECLiPSe, 3rd edn, Jacek Skalmierski Computer Studio, Gliwice, 2014.

26.

Pellerin, Perrier and Berthaut, A survey of hybrid metaheuristics for the resource-constrained project scheduling problem, European Journal of Operational Research 280(2) (2020), 395–416. doi:10.1016/j.ejor.2019.01.063.

27.

Pesic and W.M.P. van der Aalst, A declarative approach for flexible business processes management, in: Business Process Management Workshops, Proceedings BPM 2006 International Workshops, BPD, BPI, ENEI, GPWW, DPM, Semantics4ws, Eder and Dustdar, eds, Lecture Notes in Computer Science, Vol. 4103, Springer, 2006, pp. 169–180. doi:10.1007/11837862_18.

28.

Project Scheduling Problem Library – PSPLIB, http://www.om-db.wi.tum.de/psplib/.

29.

Schimpf, Logical loops, in: Logic Programming, 18th International Conference, ICLP 2002, Copenhagen, Denmark, July 29–August 1, 2002, Proceedings, P.J. Stuckey, ed., Lecture Notes in Computer Science, Vol. 2401, Springer, 2002, pp. 224–238. doi:10.1007/3-540-45619-8_16.

30.

Schimpf and Shen, ECLⁱPS^e – from LP to CLP, Theory and Practice of Logic Programming 12(1–2) (2012), 127–156. doi:10.1017/S1471068411000469.

31.

Schulte, Tack and M.Z. Lagerkvist, Modeling and programming with Gecode, 2019, Available from http://www.gecode.org.

32.

Sindelar, R.K. Sitaraman and Shenoy, Sharing-aware algorithms for virtual machine colocation, in: Proceedings of the Twenty-Third Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA’11, Association for Computing Machinery, New York, NY, USA, 2011, pp. 367–378. doi:10.1145/1989493.1989554.

33.

The ECLiPSe constraint programming system, available from http://eclipseclp.org/.

34.

J.D. Ullman, NP-complete scheduling problems, Journal of Computer and System Sciences 10(3) (1975), 384–393. doi:10.1016/S0022-0000(75)80008-0.

35.

Zarandi, Hossein, A.A. Sadat Asl, Sotudian and Castillo, A state of the art review of intelligent scheduling, Artificial Intelligence Review 53(1) (2020), 501–593. doi:10.1007/s10462-018-9667-6.


¹ $\Sigma^{*}$ is the set of all sequences consisting of zero or more elements of $\Sigma$.

² The complete ECLiPSe-CLP code and the data sets that we have used in our experiments can be downloaded from http://software.ucv.ro/~cbadica/aicomm2020.zip.