Abstract
Taking inspiration from both Constraint Programming (CP) and Logic Programming (LP), the
Keywords
Introduction
Timeline-based planning is an approach to temporal planning introduced in [29]. It consists of synthesizing temporal functions that describe the variation over time of a set of domain dynamic features to achieve temporally extended goals (see also [35] for an introduction and [11] for a recent formalization). Most of the current timeline-based planners, like
Mostly based on the notion of partial order planning [38], timeline-based planners have usually neglected advantages from classical planning triggered from the use of both
To directly cope with such pitfalls, we have developed a new framework, called
This work is a consolidated version of a subsequent work [17] where some of the “removed” constraints (in particular, the disjunctions) have been reintroduced in the heuristic, thus enriching its informativeness and allowing improved performances of the resolution algorithm, particularly on those domains in which performances were worse.
The current paper follows the path paved by those previous works [16, 17]. It describes in more detail the composing elements of
The paper is organized as follows. Section 2 introduces the basic principles underlying the
I LO C: An integrated logic and constraint reasoner
In building our planner we have first created a reasoner based on first-order logic and constraint programming, then we have endowed such reasoner with some domain independent heuristics to improve the performance of the resolution process and, as a third step, we have introduced features to create a timeline-based planner. It is worth observing that the resulting planner is quite different with respect to the pre-existing purely constraint-based implementations – see later discussion in Section 7.
The basic core of the
Once objects are defined,
In addition, it is possible to impose constraints on existentially quantified variables (e.g., ∃l ∈ Locations : l.x ≥ 10) as well as universally quantified variables (e.g., ∀l ∈ Locations : l.x ≤ 100). By combining logical quantifier and object oriented features,
A rather straightforward method for managing this kind of problems is to translate them into a Satisfiability Modulo Theories (SMT) problem (see, for example, [33]). SMT is the problem of deciding the satisfiability of a formula with respect to some theory. In particular, the theories we are interested in are those of Linear Arithmetic (LA) and Nonlinear Arithmetic (NLA) over both Reals and Integers. Since SMT checks the satisfiability of a formula over a theory, SMT solvers support natively reified constraints, therefore we can easily express any combination of numerical constraints. Furthermore, SMT solvers are generally endowed of incremental features, providing support for the addition and the retraction of variables and constraints through a stack. Such features, we will see soon, are essential for allowing our solver to backtrack and thus for guaranteeing a complete search during the resolution phase. There are several available SMT solvers having different performances and capabilities (e.g., linear vs non-linear mathematical constraints). Since
Rules and requirements
Although this basic core allows the definition of quite complex problems, some of the problems we are interested are excluded from the possibility of being modeled with this formalism. In order to overcome these limitations, we need something more powerful. Something that, roughly speaking, is able to “decide” the number of involved variables, together with their value. For this purpose, we have chosen to extend the above formalism by allowing many-sorted first-order Horn clauses 3 , i.e., clauses with at most one positive literal, called the head of the clause, and any number of negative literals, forming the body of the clause. For example, we could use a predicate such as FirstQuadrant, with a Location l argument, within the clause FirstQuadrant (Location l) ⇐ 〚l.x ≥ 0〛 ∧ 〚l.y ≥ 0〛, for describing locations in the first quadrant of a Cartesian coordinate system. Furthermore, we do not allow constraints in the head of a clause but we slightly relax the “positive” literals in the body by allowing constraints to appear in any logical combination (i.e., we could rewrite the above example as FirstQuadrant (Location l) ⇐ ¬ 〚l.x < 0〛 ∧ ¬ 〚l.y < 0〛). We exploit this implication-based syntax by analogy with logic programming. By exploiting De Morgan’s laws 4 , it should not be a problem, for the reader, to recognize the equivalence with the clausal form.
A consequence of what we have seen is that
The distinction between rules and requirements allows us to make a comparison with classical planning theory. Specifically, the set of rules represents what, in classical planning, is typically called domain theory (or, interchangeably, domain model) while the set of requirements represents what, in classical planning, is typically called a problem. Furthermore, in common classical planning languages, it is customary to define types within domain models and their instances (i.e., the objects) within planning problems. A detailed description of the input language is out of the scope of this document. However, it is worth noting that a complete problem is described by (i) the definition of the types, (ii) the rules, (iii) the type instances and (iv) the requirements. As regards the solution, this is completely described by a set of requirements in which any call to a predicate is a fact (i.e., there are no goals in a solution).
It is worth highlighting that, contrary to what happens in common logic programming, we do allow variables in fact atoms. This allows us to easily model pure scheduling problems and, furthermore, to easily model classical planning causality (we will see an example later on). However, a consequence of this is that, unlike logic programming, we cannot use capitalization to distinguish between facts and goals. We need, therefore, to explicitly annotate atoms in order to distinguish them (i.e., goal : FirstQuadrant (Location l) is a goal while fact : FirstQuadrant (Location l) is a fact). Finally, we consider atomic formulas as objects (and predicates as types) within our object oriented environment and, as such, they have an identifier which can be used, by means of the dot notation, to address their arguments and to use them within constraints. Therefore, given a predicate as At (Location l) representing, for example, the position of a robot, we might define a problem as, for example, fact:at0 : At (), goal:at1 : At () and at0 . l ! = at1 . l (notice the identifiers at0 and at1 for the atoms and their use within constraints). Each time a new atomic formula object is created, a new existentially quantified variable is added for each of the arguments of its predicate. For example, when creating the fact fact:at0 : At (), a new object variable, whose allowed values are all the instances of Location, is created and associated to the fact at0.
The resolution algorithm
From an operational point of view,
Figure 1 shows a general description of the
Initially, the working memory contains all the requirements. For each goal in the agenda, in general, a branch in the search space is created. Resolution, at first, will try to unify goals with compatible facts already present in the working memory, if any, creating a single branch for all the eligible candidates. Specifically, given the existing facts , having the same predicate of the goal, the formula (we will say, in short, p g ≡ p1lor … lorp j ) is added to the working memory and the goal is removed from the agenda. In addition, a branch is also created for each of the rules whose head unifies with the chosen goal and, whenever such a branch is chosen by the resolution algorithm, the body of the corresponding rule is added to the working memory possibly generating further facts, goals and constraints to be managed. Also in this case, since the chosen goal has been solved, it is removed from the agenda however, contrary to the unification case, this goal is now considered as a new fact within the working memory.
The final aim of the resolution process is, therefore, to remove all the goals from the agenda. At the end of the resolution process, from a logical point of view, all the goals might be seen as a consequence of the application of a rule. Intuitively, the purpose of unification is to avoid considering those goals whose (any) rule has already been applied. Needless to say that all the constraints among all the objects of the working memory must be satisfied.
Summarizing, the basic operations for refining the working memory π toward a final solution are the following: Find all the (sub)goals of π (i.e., the agenda). Select one such (sub)goals. Find ways to resolve it (i.e., find all the possible unifications and all the rules which might be applied). Choose a resolver for the (sub)goal (i.e., either unify with an existing fact or choose a compatible rule). Refine π according to the chosen resolver.
The process follows an A* search strategy that aims at minimizing the number of goals in the agenda, proceeding until there are no more goals into the agenda and while all the constraints in the working memory are consistent. As briefly mentioned earlier, whenever the constraints become inconsistent the system performs a backtracking step.
Figure 2 shows how this process is applied to the simple problem depicted. We have shortened goal in g for space reasons and used a color schema for distinguishing facts from goals (i.e., orange circles represent goals and green circles represent facts). Furthermore, we have used different variable names to further highlight the differences between the rules space and the working memory. At the beginning, the two goals a0 and b0 need to be resolved, so, for example, the former is chosen. Since unification with already existing facts is not possible (there is no fact yet in the current working memory), the first rule is applied, resulting in the addition of sub-goal goal:b : B (y) and constraint 〚x < b.x〛. At this point, goal b0 is chosen. Again, unification is not possible so the second rule is applied, resulting in the addition of constraint 〚x < 10〛. Last goal to be solved is subgoal b. Since unification is now possible, we create a branch in the search space creating two nodes (i.e., Node 1 and Node 2) and in the former we unify formulas b and b0 (by adding the constraint 〚v = y〛) while in the latter we apply the second rule a second time. It is worth noting that by selecting b0 at the first step, although with different paths, we would have got the same two final nodes. In addition, since the first two steps do not require choices, there is no need for creating a branching and the resolution can continue “inside” Node 0. Finally, both the resulting nodes represent a solution for our starting problem (i.e., no goals in the agenda and consistent constraints) and thus the resolution process can terminate. Clearly, the presence of additional (sub)goals would have required further unifications and/or rules applications.
The ALL REACHABLE heuristic
Since all the goals must be solved sooner or later, there is almost no difference among which goal is solved first. Selecting the “right” goal, however, impacts heavily with the efficiency of the resolution algorithm. In order to overcome this obstacle we can take advantage of some heuristics. It is worth to highlight the fact that, in general, for solving our problem, two heuristics are needed. Specifically, we need an heuristic for selecting the right goal to be solved in the current node of the search space and an heuristic for selecting the next node of the search space from the fringe. This is similar to what happens in Constraint Satisfaction Problems (CSP) in which a variable selection heuristic, for selecting the next variable, is flanked by a value selection heuristic, for assigning a value to the selected variable. Given the similarity with CSPs, we also refer to our approach with the term Meta-CSP.
As a first attempt (see [16] for further details) in producing our heuristics we have created a data structure, called static causal graph (since it does not change during the resolution process), aimed at producing some kind of information which might be used to guide the search process. Our graph has a node for each of the predicates that appear in our rules and, for every rule, an edge from the head of the rule to each of the predicates that appear in the body of the same rule. The main idea was to create a simple data structure (keeping
Initially, we considered the estimated cost for solving a goal (i.e., the estimated cost of our equivalent of variable selection heuristic) as the number of reachable nodes from the node relative to the predicate associated to the goal. We called such a strategy “All Reachable - Goal Selection Heuristic” (A
At first, we did not consider at all the possible disjunctions (resulting from rules having the same head) of the domain theory, nor the arguments of the predicates nor the constraints among the objects. In order to mitigate this oversight, we selected nodes from the fringe of the search space according to another heuristic (i.e., the estimated cost of our equivalent of value selection heuristic). By considering the resolution process as an optimization problem, the first choice was to choose the node having the least unresolved (sub)goals. Better than nothing, such a strategy did not appear smart enough since it completely ignored the kind of involved (sub)goals. A slightly better strategy consists in reusing the idea behind A
To sum up, when choosing the next goal to be solved, we choose the goal having the minimum estimated cost, when choosing the next node to be solved, we choose the node having the minimum estimated distance from the solution. The fact that the A
The MIN REACH heuristic
A slight improvement to our heuristic is constituted by the addition of disjunctions into the static causal graph by means of two special nodes representing conjunctions (AND nodes) and disjunctions (OR nodes). Figure 4 shows the improved static causal graph generated from the example in Fig. 2. The cost for solving a goal is now evaluated as the minimum number of reachable nodes starting from the node associated to the goal predicate. The general idea here was the following: whenever the resolution algorithm finds a disjunction, the application of the rule that would lead to the minimum number (notice that we are persevering in the above mistake!) of formulas should be chosen. We call such a strategy M
One might argue that by introducing disjunctions into the static causal graph we increase the complexity of the evaluation from polynomial to exponential. However, just as for the A a boolean variable is associated to each predicate and to each AND node; for each arc 〈s, t〉, going from source node with boolean variable s to target node with boolean variable t, a clause (¬ s, t) is added; for each arc 〈s, OR〉, going from source node with boolean variable s to an OR target node, we consider the variables b1, …, b
n
associated to all the n nodes directly reachable from the OR node and a clause (¬ s, b0, …, b
n
) is added.
Each predicate can now be evaluated as follows: we assume a unit clause containing the variable associated to the predicate we want to evaluate, solve the resulting MIN-ONE SAT problem, count the number of positive literals associated to predicates and subtract 1, since we don’t count the starting node. As an example, the resulting MIN-ONE SAT problem associated to predicate G of Fig. 4 is the following (we use lowercase names for the associated boolean variables):
Similar to what we did for the A
Timeline-based planning and I LO C
As we will see in this section, the logic-based formalism introduced in the previous sections is powerful enough to represent timelines. Furthermore, the proposed heuristics are not strictly limited to the timeline-based planning problems. The search space of a timeline-based planner, however, has typically partially specified plans as nodes and plan refinement operations as arcs. A partially specified plan maintains causal and temporal relations among timely scoped formulas (usually called tokens) and, therefore, is really similar to the working memory of the
A possible approach to the modeling of timeline-based planning problems is to endow the predicates described in the previous sections with numerical arguments in order to represent timely scoped atomic formulas
5
. Such atoms represent the assertion of a certain fact within a given time interval. For example, the formula At (r, l, s, e, d) might be used for representing the presence of a robot r in the location l from time s (for start) to time e (for end) for a duration d. This expedient, however, is not enough to detect all the inconsistencies. We need some specialized algorithms which understand the proper semantic of such numerical arguments recognizing them as temporal information. Unfortunately, such algorithms are dependent on the specific kind of timeline. One possibility which we have pursued is to exploit the complex types of the
Unlike most of the timeline-based planners which consider timelines as a sort of “containers” for the atomic formulas, we simply add a parameter, having the same type as the timeline, to the predicates and call such a parameter scope. The type of our scope variables will be a “distinguisher” for triggering further reasoning required by the specific timeline. Furthermore, the resulting scope variables are, to all effects, variables and, therefore, could be subject to constraints. In the following we will describe some of these timelines, commonly used in timeline-based planning, providing some intuition of their specialized algorithms.
State variables They are used to describe the “state” of a dynamical system as, for example, the position of a specific object at a given time or a simple manufacturing tool that might be operating or not. The semantics of a state variable (and thus the implicit constraints we need to make explicit) is simply that, for each time instant , the timeline can assume only one value. Figure 5(a) represents an example of state variable with three atomic formulas (parameter types are omitted for sake of space). The example shows a robot r0, a state variable of type Robot, which might be At a given location or might be Going to another location. We thus have the two predicates At (sc, l, s, e, d) and Going (sc, l, s, e, d) each having a parameter sc of type Robot describing the scope of the formulas and parameters l, s, e and d respectively for the location, the start, the end and the duration. The planner will take care of adding the proper constraints for avoiding the temporal overlapping of the incompatible states (i.e., all the formulas which have the same scope and do not unify) or for “moving” the states on other instances of type Robot (i.e., choosing another value, for example r1, for the scope of the formula).
Resources They are entities characterized by a resource level, representing the amount of available resource at any given time, and by a resource capacity, representing the physical limit of the available resource. We can identify several types of resources depending on how the resource level can be increased or decreased in time. A consumable resource is a resource whose level is increased or decreased by some activities in the system. An example of consumable resource is a reservoir which is produced when a plan activity “fills” it (i.e., a tank refueling task) as well as consumed if a plan activity “empties” it (i.e., driving a car uses gas). Consumable resources have two predefined rules, each having an empty body, and a predicate Produce (sc, id, a, s, e, d) (Consume (sc, id, a, s, e, d)) as head, so as to represent a resource production (consumption) on the consumable resource sc of amount a from time s to time e with duration d (we use an id parameter to prevent unification among these formulas). In addition, the consumable resource complex type has four member variables representing the initial and the final amount of the resource, the min and the max value for the resource level. Quite popular in the scheduling literature, reusable resources are similar to consumable resources where productions and consumptions go in tandem at the start and at the end of the activities. Reusable resources can be used for modelling, for example, the number of programmers employed on a given project for a given time interval. Reusable resources have one predefined rule having an empty body and a predicate Use (sc, id, a, s, e, d) as head so as to represent an instantaneous production of resource sc of amount a at time s and an instantaneous consumption of the same resource sc of the same amount a at time e. In addition, the reusable resource type has a member variable for representing the capacity of the resource. Figure 5(b) and (c) represent, respectively, an example of consumable resource and an example of reusable resource with some associated formulas.
Enhancements to the reasoner By introducing these complex types, we require the reasoner to add further constraints so as to avoid object inconsistencies (e.g., different states overlapping for some state variable; resource levels exceeding resource capacity or going lower than min, etc.). We chose to refine our resolution process by introducing a step for detecting such inconsistencies and for adding required constraints which would remove them. The resulting basic operations for refining the working memory π toward a final solution are thus the following: Find the (sub)goals of π. Select one such (sub)goals. Find ways to resolve it. Choose a resolver for the (sub)goals. Refine π according to that resolver. Check for any object inconsistency and remove it.
Similar to [8], we use a lazy approach for detecting inconsistencies. Namely, we let the underlying SMT solver to extract a solution given the current constraints and, in case some inconsistency is detected we add further constraints so as to remove the inconsistency. A simple example should clarify the idea. Let us suppose in a given working memory there are two formulas describing a state variable sv k having two overlapping states s i and s j , we solve the inconsistency by adding the constraint 〚s i . start ≥ s j . end〛 lor 〚s j . start ≥ s i . end〛 lor 〚s i . scope ≠ s j . scope〛 preventing further overlapping of these states on the same state variable. The core idea for solving resource inconsistencies follows a very similar schema.
Planner evaluation
To assess the value of our heuristics, we have endowed
The problems We start the comparison by solving the Blocks World domain, a workhorse for the planning community. As known, in this domain a set of cubes (blocks) are initially placed on a table. The goal is to build one or more vertical stacks of blocks. The catch is that only one block may be moved at a time: it may either be placed on the table or placed atop another block. Because of this, any blocks that are, at a given time, under another block cannot be moved. We used the 4-operator version of the classic Blocks World domain, as found on the International Planning Competition (IPC) website 6 , as a starting point. Furthermore, we use a simpler variant of the general problem (usually called Tower) in which, at the initial state, all the blocks are on the table and whose goal is to stack all the blocks on a single tower. Specifically, for each block, we defined a state variable for representing what is on top of the block (i.e., either another block or the value “Clear”) and a state variable for representing if the block is on the table or not. An additional state variable has been defined for modeling the robotic arm supporting values that represent either the arm holding a block or the value “Empty”. Finally, we defined an “Agent” complex type for modeling the agents’ actions. Rules have been defined so as to have an atomic formula for each effect of the PDDL actions as head and an atomic formula for the actions as body (modeled as a subgoal), aside from rules having an atomic formula for each PDDL action as head and an atomic formula for their preconditions (modeled as subgoals) and effects (modeled as facts) as body. By modeling the effects of the actions through facts we avoid the introduction of further subgoals into the agenda when applying the rule associated to the action. Temporal constraints have been conveniently added for guaranteeing that preconditions precede actions and effects follow actions.
We have also checked our system with two other problems, namely the Temporal Machine Shop [15] and the Cooking Carbonara domain [27]. Both these problems are temporally expressive (see [14]) since they require concurrency for being solved. Domain models and instances have been taken from the TLP-GP planner website 7 .
The Temporal Machine Shop problem is the only temporally expressive problem of the IPC and, up to now, in the various editions of the competition, it is solved by the sole ITSAT planner (see [31]). The problem models a baking ceramic domain in which ceramics can be baked while a kiln is firing. Different ceramic types require a different baking time. While a kiln can fire for at most 20 minutes at a time (and then it must be made ready again), baking a ceramic takes, in general, less time, therefore we can save costs by baking them altogether. Additionally, similar to [31], we have slightly complicated the domain by considering the possibility for ceramics to be assembled, so as to produce different structures which should be baked again to obtain the final product. Specifically, for each kiln we defined a state variable for distinguishing either the kiln is “Ready” or “on Fire”. In addition, each kiln has associated a reusable resource for representing its capacity. For each ceramic piece we defined a state variable for representing either the piece is “Baking” (with an additional parameter for representing the kiln in which is baking), or the piece is “Baked”, or the piece is “Treating”, or the piece is “Treated”. Similarly, for each ceramic structure we defined a state variable for representing either the structure is “Assembling”, or the structure is “Assembled”, or the structure is “Baking” (with an additional parameter for representing the kiln in which it is baking), or the structure is “Baked”. Rules force these values to appear in time, in each state variable, in the intuitive manner (i.e., in the order in which these values have just been introduced). The interesting aspect, however, is that ceramic structures can bake concurrently with ceramic pieces both while (hence the temporal expressiveness) the kiln is firing.
The Cooking Carbonara domain represents another temporally expressive problem in which the aim is the preparation of a meal, as well as its consumption by respecting constraints of warmth. Problems cooking-carbonara-n allow to plan the preparation of n dishes of pasta. The concurrency of actions is required to obtain the goal because it is necessary that the electrical plates work in a way that water and oil are hot enough to cook pasta and bacon cubes. It is also necessary to perform this baking in parallel to serve a dish that is still hot during its consumption. Specifically, for each plate we defined a reusable resource for representing its (unary) capacity. For each pot we defined a state variable for distinguishing either the pot is “Boiling” (with an additional parameter for representing the plate on which is boiling) or the pot is “Hot”. For each pan we defined a state variable for distinguishing either the pan is “Boiling” (with an additional parameter for representing the plate on which is boiling) or the pan is “Hot”. Each portion of spaghetti has associated a state variable for distinguishing either the portion is “Cooking” (with an additional parameter for representing the pot in which is cooking) or the portion has been “Cooked”. For each bacon portion we defined a state variable for distinguishing either the bacon is “Cooking” (with an additional parameter for representing the pan in which is cooking) or the bacon has been “Cooked”. Each egg has associated a state variable for distinguishing either the egg is “Being beaten” or the egg has been “Beaten”. Finally, for each carbonara portion we defined a state variable for distinguishing either the portion is “Cooking” (with an additional parameter for representing the plate on which should be cooked), or the portion has been “Cooked”, or someone is “Eating” the portion or the portion has been “Eaten”. Again, rules force values to appear in time, in each state variable, in the intuitive manner (i.e., in the order in which these values have just been introduced). Furthermore, carbonara portions should be cooking after spaghetti, bacon and eggs have been correctly prepared, hence requiring spaghetti to be “Cooking” while the water in pots is “Hot” as well as bacon to be “Cooking” while the oil in pans is “Hot”. Finally, cooking carbonara portions, boiling water in pots and oil in pans should be performed while plates are available.
Results and discussion Starting from the blocks world (the problem where causal reasoning is paramount), we can see, Fig. 6, that despite the introduction of our heuristics, planners endowed with “classical heuristics” still perform significantly better than our approach. Thanks to the M
Experimental results on the other domains (Figs. 7 and 8) show that
A separate discussion it is worth doing concerns the expressiveness of
Managing execution uncertainty We now address a different perspective for evaluating
Consider, for example, a simple logistic company having a fleet of two trucks (see Fig. 9). The two tasks t0 and t1 are scheduled for Truck 1 respectively from 10:00 to 12:00 and from 13:30 to 15:30. Although task t0 can be slightly postponed, we have a further constraint asserting it must end strictly before 17:00. However, in order to avoid penalties with its customer, task t1 must be executed exactly at the scheduled time. Now suppose that at 9:55 our logistic company discovers that Truck 1 has some problem at the engine. Since estimated repair time is one hour, we could temporally adapt the plan for starting t0 at 11:00 (we achieve this by adding the constraint 〚t0 . start > =11〛). This kind of temporal adaptation is quite common for any planning and execution environment (see, for example, [30] and [6]) up to be considered mandatory. Now imagine at 10:55 we discover that repair time was underestimated and a further hour is required. By moving t0’s start time at 12:00 we would interfere with task t1, scheduled for time 13:30. Clearly, given these constraints, a replanning phase would assign task t0 to Truck 2. However, by exploiting the scope parameter introduced earlier, we can adapt the plan by reassigning task t0 and thus avoiding a possibly time consuming replanning phase. We recall that the scope parameter is indeed a variable, and therefore we can let the reasoner choose a value for it according to involved constraints. In other words, similar to what happens for temporal adaptation, we can perform also more complex forms of adaptation, including this kind of “causal” adaptation. It could be argued that such an adaptation could be expensive as well and, indeed, it is exponential in the worst case. Nevertheless, the underlying constraint solver allows efficient constraint propagation, making the adaptation process negligible, with respect to a complete replanning process, on the simple problems that we have used for testing this kind of flexibility.
About timeline-based planning heuristics
Despite some performance issues on some specific domains, thanks to the expressive representation capabilities of the
Similarly to
The performance aspect, however, remains still open. Despite the formalization would remain valid, some of our latest studies are gradually leading us to reconsider the concepts at the core of the framework. The main learned lesson, indeed, is that reasoning about the sole elements within the current partial plan might not be sufficient for producing adequately informative heuristics. Although in a limited way,
Reasoning about the sole elements within the current partial plan also appears in other similar planners including A
Some
In the next section we will discuss some of the reasons for the blindness of the heuristics. We will then try to provide some guidelines which might lead, in future works, to bridge the performance gap between classical and timeline-based planning.
Directions for further work
Finding the right balance between the informativeness and the computational complexity of an heuristic is not an easy task. In any case the current level of informativeness is not satisfactory and should be increased. A first refinement step for the future might be to consider the initial state of the working memory. Consider again the graph in Fig. 4. Now suppose two facts F () and H () are initially present in the working memory. We might imagine that eligible subgoals, coming from the application of the rules, will eventually unify with the above facts. We might consider such information for enhancing the informativeness of the heuristic by pruning, in general, the causal graph. In such a circumstance, indeed, the cost for solving a G () goal is now further reduced from 1 to 0. Indeed, despite they are two, we might consider the atoms F () and H () preferable, since they will be unified, compared to the sole formula I ().
Indeed, it is possible to do more. Consider again the graph in Fig. 4. The two edges incoming into node F represent the unification of two atoms having predicate F. Now suppose predicate F has some argument as, for example, an integer i. Suppose, also, that the body of the rule, having E as head, contains a constraint 〚i > 0〛, while the body of the rule, having G as head, contains, in addition to F, a constraint 〚i < 0〛. Clearly, the two atoms will never unify. Now, despite the initial state, the preferable choice for the disjunction associated to predicate G () goes back to being the sole formula I (). Furthermore, similar constraints might also be present in the initial state of the working memory.
Neglecting this information seems a valid reason for not scaling well, especially on causally biased domains. It is less clear how to consider this information within the heuristics and, therefore, how to overcome our performance issues. If we want to consider more information within our heuristic, a possible path to follow might be to replace predicates directly with atoms in our causal graph. Unfortunately, this transition is not as straightforward as it might seem. We will provide, in the next sections, some insights on how it can be achieved.
The constrained causal graph
In order to replace predicates with atoms, we should create a data structure which (i) does not propagate constraints in case of disjunctions (we recall that such disjunctions represent branches in the search tree), yet (ii) powerful enough to propagate constraints that allow us to recognize ineligible unifications.
The first of our problems can be easily solved through simple arc-consistency techniques (see, for example, [19]). Roughly speaking, a variable of a Constraint Satisfaction Problem (CSP) is said to be arc-consistent with another one if each of its admissible values is consistent with some admissible value of the second variable. Specifically, a variable x i is arc-consistent with another variable x j if, for every value a in the domain of x i there exists a value b in the domain of x j such that (a, b) satisfies any binary constraint between x i and x j . We might use bound consistency to represent numeric variables domains and, whenever possible (i.e., in case of linear constraints), we will apply incremental Gauss-Jordan elimination [28]. Now, what happens if constraints are not binary? Arc-consistency is not powerful enough to propagate such constraints. However, since we do not want too much propagation, what might seem a limitation of this local consistency technique turns out to be, actually, a strength.
Concerning the second issue, similarly to the InPlan variables of CPT [37], for each atomic formula p, we might create an enumerative variable state (p)∈ { Inactive, Active, Unified } indicating the state of the atom p. Specifically, an atom p might be inactive, in which case it is not considered within the current working memory, might be active, in which case it is considered within the current working memory, or might be unified with an already active formula. The trick is to constraint these variables among themselves, together with the reified constraints coming from rules, in order to represent, in a limited way, the causality of the different plans emerging both from the rules and from the initial working memory. To check whether a formula p is not unifiable with a fact f it would be enough to check if the constraints state (p) ={ Unified }, state (f) ={ Active } and p ≡ f would make the problem inconsistent.
We call this arc-consistency based network, representing both constraints and causal relations, constrained causal graph. We would build such a graph incrementally, starting from the initial working memory. In analogy with classical planning, we might call action both the application of a rule and a unification of a goal with a fact. We might create a boolean variable a i for each of the actions that will be added to our graph. In addition, we might introduce a fictitious action, with an associated boolean variable a0, whose application would represent the resolution of the whole planning problem which, in this light, would become the problem of creating the preconditions for the application of this action. In the following we will use a i to refer both to the boolean variable and to the associated action.
It is worth noticing that these actions, in most of the cases, would be really similar to the classical planning actions. Indeed, we would try to benefit as much as possible from this analogy. In case of the application of a rule, for example, we might consider a classical action having as preconditions the body of the rule and as effect the achievement of the goal having the same predicate as the head of the rule. Similarly, in case of the unification, we might consider a classical action having as preconditions the fact with which the goal is unifying, together with the unification constraint and as effect, again, the achievement of the goal having the same predicate as the head of the rule. Finally, the fictitious action a0 might be considered as a classical action having as preconditions the initial requirements and as effect the solution of the planning problem. Since we want our problem to be solved, we would force variable a0 to be equal to true.
In building our graph, we would have different behaviors for actions representing rule applications and actions representing unifications. For each actions a i , representing the application of a rule, we would force the validity of the constraints in the preconditions of a i to be equal to the variable a i 8 . To this purpose, we recall we have reified constraints. For each fact p j in the preconditions of action a i we would add the constraints a i ⇔ state (p j ) = Active and ¬a i ⇔ state (p j ) = Inactive. For each goal p j in the preconditions of action a i , considering A (p j ) the set of actions a k that achieve p j , we would add the constraint a i ⇔ ExactlyOne (A (p j )). Now, for each action a k in A (p j ) representing an application of a rule, we would add the constraints a k ⇔ state (p j ) = Active and ¬a k ⇔ state (p j ) = Inactive. Similarly, for each action a k in A (p j ) representing a unification with an atom p u , we would add the constraints a k ⇔ state (p j ) = Unified ∧ state (p u ) = Active ∧ p j ≡ p u and ¬a k ⇔ state (p j ) = Inactive.
Similarly to our agenda, we would maintain a queue of goals to be solved and we will add new actions backwards until either there would be no goals or every goal, individually, could be achieved by means of an unification action. In other words, a goal which might be achieved through an unification action would be considered solved and all the subgoals within preconditions of the other actions for achieving the goal, at first, would not be considered for further expanding the graph.
It is worth to spend some more words about the latter case. Specifically, it should be noted that, in general, the fact that two goals, individually, might unify, for example, with the same fact, does not necessarily imply that, together, the two goals will actually unify. In general, it depends from the involved constraints. Furthermore, the addition of constraints during the construction of the constrained causal graph might invalidate previous unifications. In these cases, we should be able to consider again some of the subgoals initially excluded from the agenda in order to subsequently refine the graph. However, we would not consider, at first, these cases, and might use the graph for generating a heuristic which would tell us when and how to refine the graph. Whenever we find a goal which might unify with an eligible fact we would create a backtracking point, we would force the state of the fact to be Active, we would force the state of the goal to be Unified, we would check that the unification constraints propagate and, in so, we would consider the goal as solved. In any case we would restore the state of the arc-consistency network as it was before the creation of the backtracking point.
Another way to describe our actions is to consider them as the actions that the planner needs to perform in order to produce a valid plan. Although our action might take into account temporal aspects in their preconditions, they are to all effects impulsive actions like those of classical planning. In this light, we are interested in finding a sequential plan whose execution, by the planner, would lead to the generation of our desired plan.
Finally, it is worth highlighting some similarities of our causal graph with the classical
Using the constrained causal graph
The main aim of the constrained causal graph is to extract some values for an heuristic. Given the analogy with classical planning actions, the core idea is to use classical planning heuristics on the domain model represented by our actions. Specifically, we would use the constrained causal graph to extract a heuristic which is very similar to the h max heuristic described in [5]. Specifically, the cost of achieving a set of atoms C would be approximated by the cost of achieving the most costly atom in the set. We give a formal recursive description of this heuristic:
where f is an atom and A (f) is the set of actions for achieving the atom f. Roughly speaking, we would consider unifying goals as having cost zero. If different actions would achieve a goal, we would choose the less expensive action, however, the cost of an action would be given by the cost of the most expensive of its preconditions.
We might use this heuristic to guide the overall search process which, at this stage, might be seen as a simple search problem within the same constrained causal graph. We would have built a constraint satisfaction problem which would represent a solution for our reasoning problem. At the beginning, we would check the consistency of all the objects (e.g., the timelines) which might add further constraints to the graph. Then, we would choose the atom p having the highesth max (p) value and would remove the value Inactive from its state. Also, we would choose the action a k , among the set of actions A (p) achieving the goal p, having the lowest value for h max (pre (a k )) and would force its value to true.
Unfortunately, since the constrained causal graph structure would bring with itself an high level semantic from which the heuristics for the resolution of the constraint satisfaction problem (i.e., the whole reasoning problem) would be deduced, there is no hope, in any available constraint propagator, that such a behavior would emerge from its already existing heuristics. We have already tried this approach with standard solvers with not satisfactory results, therefore, a drawback of our situation is that we should discard available constraint based solvers in favor of solutions built from scratch.
Conclusions
This paper has introduced the
The
One final comment is worth being done: although applicable to any logic based problem, the proposed heuristics have been tested on classical planning problems. The reason for such a choice is twofold: (a) we are pursuing the idea of a domain-independent planner able to solve efficiently the wider spectrum of planning problems; (b) the planning community has plenty of benchmarking problems. Despite such abundance of problems, most of them do not take into account the temporal aspects of reality. We believe, however, that these problems might constitute a springboard for addressing more complex domains.
Footnotes
While SMTInterpol provides a pure Java implementation, MathSAT and Z3 provide Java wrappers to their native API. We have not found other SMT solvers that provide, directly or indirectly, a Java API.
This means, in general, sacrificing decidability.
De Morgan’s laws allow the expression of conjunctions and disjunctions in terms of each other via negation. Specifically, given two terms a and b, we have that ¬ (a ∧ b) = ¬ alor ¬ b and that ¬ (alorb) = ¬ a ∧ ¬ b.
Partial order and timeline-based planners commonly call such atoms “tokens”.
A more efficient implementation would use the same variable a i when building the constraints.
If needed, negations might be represented by adding a boolean argument to predicates representing the polarity of the atoms.
Acknowledgments
Authors work is partially funded by the Ambient Assisted Living Joint Program under the SpONSOR project (AAL-2013-6-118).
