Abstract
Acting in the real world may be a difficult task for an agent, either software or robotic, because unexpected contingencies may arise at any step of the execution. Previous approaches to robust plan execution consider propositional goals to be achieved and time constraints to be satisfied. However, realistic plans must obey to constraints on continuous/consumable resources, too.
To face the complexity in handling these resources, the paper proposes the notion of Multi Modality Action (MMA). The model allows to explicitly express the multiple execution modalities in which a given action can be executed; each execution modality models requirements/consequences on the involved consumable resources when that modality is selected. Relying on the MMA notion, the paper presents how the repair problem can be seen as a problem of reconfiguring actions modalities, and how it can be solved by exploiting a CSP encoding.
The MMAs are employed by a new continual planner, FLEX-RR, which, exploiting the synergy from the reconfiguration and a numeric planning mechanism, can efficiently repair on the fly the plan keeping it rather stable. An empirical analysis, performed on three numeric planning domains, confirms the large benefits of FLEX-RR in terms of competence, efficiency and stability of the repaired plan.
Introduction
Real-world scenarios are in general weakly predictable and highly dynamic. This means that unexpected contingencies may arise at any step of the execution. It is not thus surprising that, as also recently stated in [15], the problem of robust plan execution in real-world domains is a very challenging topic in AI planning, necessary for implementing planning mechanisms in autonomous systems.
A first methodology to cope with the problem consists in anticipating, at planning time, all the possible contingencies that might occur during the actual execution of the plan. The result of the planning phase is therefore a conditional plan [5,18], i.e., a plan where alternative courses of actions are possible depending on certain conditions of the environment. During the execution phase, the agent, guided by its sensing actions, is able to select a feasible branch of its plan that leads to the goal. A similar methodology has been proposed in [6]; in this case, conditional plans are built to guarantee that the agent satisfies temporal constraints on the achievement of its goals.
These approaches are successful when it is possible to anticipate all the contingencies at planning time (i.e., off-line). When such contingencies cannot be handled off-line, alternative on-line solutions are required. Many works [9,11,14,40] have recently proposed to enhance robust plan execution via plan repair techniques. Basically, these works suggest to stop the plan execution as soon as some unexpected situation is encountered, and then they attempt to repair the broken plan either via replanning from scratch or via plan adaptation ([13]).
However, adapting a plan can be very hard in the general case ([32]), and even undecidable when the domain involves resources modeled as numeric state variables ([16]). To handle the repair problem in a more efficient fashion, this paper addresses the robust execution from a different perspective: to limit as much as possible the need of replanning, we propose a new characterization of the repair given in terms of a reconfiguration.
Similarly to the numeric extension of PDDL ([10]), we start from the observation that real-world scenarios include both reusable and consumable resources. Thus, a plan has not only to achieve a set of goals, but it has also to conform to strict constraints on the amount of resources to be used by the agent. For this reason, it is necessary to explicitly model the (expected) profile of resource consumption during the execution of any action. We thus model resources as numeric fluents, which can be explicitly mentioned in the preconditions and effects of the action templates.1
Since version 2.1, PDDL allows numeric fluents in the definition of action templates.
We also observe that, in many real-world domains it is possible to identify actions performing the same task (i.e., obtaining the same effects), but requiring different configurations of the agent. These actions share the same qualitative objectives, while they differ in the way they are performed. Typically, different configurations have different resource profiles. We therefore propose to group these subsets of similar actions by introducing the notion of multi-modality action (MMA). An MMA represents in a compact form alternative ways, or alternative execution modalities, to accomplish a task. In planning terms, we would say that all the execution modalities of a given MMA reach the same set of propositional effects. However, since they are characterized by specific resource consumption profiles, they differ in their numeric preconditions and effects.
From the MMA formulation we obtain theoretical and practical computational benefits; in fact, while replanning is in some case unavoidable, it can be an excessive reaction in many situations as just reasoning over the action modalities can be sufficient to overcome exceptions. Moreover, the reconfiguration of the MMAs does not affect the causal structure of the plan. For this reason, if the unexpected deviations can be handled with an alternative configuration of action modalities, the plan remains quite stable. This is important when the agent is situated in a multi-agent setting since humans and/or artificial agents may have expectations on the agent’s task.
Relying on the notion of MMAs, the paper proposes a system, called FLEX-RR (FLexible EXecution via Reconfiguration and Replanning), which is an extension of the on-line approaches based on replanning. The approach is inspired to the continual planning paradigm [2,7] since it allows the agent to interleave (re)planning and execution. FLEX-RR aims at limiting the replanner’s intervention by singling out those situations which could efficiently be resolved via (a simpler) reconfiguration phase. On the other hand, when plan reconfiguration fails or is not sufficient, FLEX-RR starts a replanning task.
Since FLEX-RR encompasses both the reconfiguration task introduced in this paper and a pure replanning approach, it can deal with those situations solvable via replanning but not via reconfiguration. Thus, in the worst cases FLEX-RR is comparable to replanning from scratch. In practice, the reconfiguration can be sufficient to overcome resource consumption anomalies in several cases, as we will discuss in the experimental section.
Section 2 introduces the main concepts at the basis of the FLEX-RR strategy: the MMA model and the Dynamic Modality-Assignment Problem (DMAP), which formalizes the reconfiguration problem. Section 3 presents the implementation of the FLEX-RR strategy. An important aspect addressed in this section is the transformation of a DMAP into a Constraint Satisfaction Problem (CSP), and its resolution via a module called ReCon (ReConfigurator). An extensive set of experiments is reported in Sections 5 and 6; firstly we compare FLEX-RR with two alternative repair strategies: replanning from scratch and LPG-ADAPT ([13]). Then, we analyze the behavior of FLEX-RR and we test the scalability of the CSP solver in handling increasing DMAPs. Finally, we discuss the related works in Section 8.
This section formalizes the problem of reconfiguring a plan as a Dynamic Modality-Assignment Problem (DMAP). Before presenting the DMAP and its properties, however, we need to review some basic notions of planning from the point of view of the Multi-Modality Action (MMA) model we propose.
Modeling the world state
Inspired by the PDDL formalism [10], we abstract the world as a finite set U of physical objects present in the environment, and as properties and relations among such objects. These properties and relations can be both qualitative and quantitative, more formally:
(World domain).
The world domain is a triple
Numerical fluents are n-ary function symbols mapping n objects of a domain (or variables when used in action schema) to a real number.
F and X define the domain in which qualitative and quantitative properties can be stated. This means that the domains of F and X consist of all the possible instantiations of relations w.r.t. the objects in U.
Given the domain defined above, a system state specifies the qualitative and the quantitative properties of the objects in U, formally:
A state S is a pair
The fluents in F not mentioned in
Note, moreover, that
To simplify the notation, in the following we will denote the propositional and the numerical part of a state S by
In the rest of the paper, we will exemplify our approach by means of the ZenoTravel domain: a logistic scenario used in the competitions for testing the ability of planners in dealing with numerical fluents.3
Taking into account the notion of system state introduced above, we now discuss how the system state evolves as an effect of an action application; in particular, this subsection introduces the Multi-Modality Action model (MMA).
As anticipated in the Introduction, execution modalities are intended to point out the different ways in which the same action can be performed, e.g., using different devices and/or parameter configurations. The intuition is that the expected results of an action can actually be obtained in many different ways by setting how the action is configured. Obviously, different configurations have, in general, different resources and time profiles. Therefore, it may be possible that the execution of an action in a given configuration leads to a plan failure, whereas the same action performed in another configuration does not.
The MMA model extends the PDDL action model by allowing to express, within a single action schema, the different alternative modalities in which that action can be performed.
More precisely, an MMA splits an action model into a qualitative and a quantitative description. The qualitative description involves the propositional behavior of an action model and is independent of the modalities. Whereas the quantitative description involves the numeric fluents and is made dependent on the modalities associated with the action. The idea is to model how the selection of a specific modality causes variations (in numeric/metric terms) in the resource profiles, especially for consumable resources.4
To avoid confusion on the use of the term, with consumable we intend resources which can be produced and consumed.
Given the world domain
The numeric expressions we deal with are recursively defined as:
a real number in ℜ is an expression;
a numeric fluent in X is an expression;
given two expressions
As many other representation formalisms handling numeric expressions (see for instance [17] and [12]), also in our framework we limit ourselves to linear expression, for two main reasons: (i) to comply w.r.t. state of the art numeric planners, and (ii) for guaranteeing efficiency of the reconfiguration task, which makes use of a CSP encoding (see Sections 2.4 and 3).
Given the model in Definition 3, an MMA to be executed in a state has to take into account both of its propositional preconditions and the numeric constraints associated with the modalities of execution. More formally:
(MMA applicability).
Given a state S, an MMA a is said to be applicable in S iff:
there exists a modality
Note that, in general, an MMA a is applicable to a given a state S with different modalities. In the following we will denote as
(Successor state).
Given a state S and an MMA a applicable in S, the application of a to S with modality
The transition is deterministic and the order in which

Multi-Modality Action template for the
An important characteristic of the MMA definition given above is the neat separation between the propositional and the numeric part. This separation allows us to introduce the notion of propositional view of an MMA. More precisely, given an MMA a, the propositional view of a, denoted as
Also the notion of successor state can be restated in terms of the propositional view of MMAs. Given an MMA a and a state S, the propositional successor state
As we will see in the next section, the possibility of reasoning in such two levels of abstraction, allows us to characterize the repair problem as a Dynamic Modality Assignment Problem.
From our point of view, the ZenoTravel domain is quite interesting since in the original version of the model there are two actions (fly and zoom) that have exactly the same propositional effects, but different numeric effects. Indeed, the two actions have different impacts on the use of resources (in particular on the numeric fluents fuel, total_fuel_used and time-spent5
Note that, as we will see in the next section, since we deal with sequential plans of actions, we can handle the time spent by actions as a consumable (not renewable) resource. Therefore, the time can be modeled as a numeric fluent and its meaning has not to be confused with the notion of time-stamp in the context of temporal planning ([10]), or in planning via timelines ([4]).
These two actions have different preconditions, but it is worth noting that the propositional preconditions are exactly the same (the airplane must be in the source airport) and the only differences in preconditions are the ones related to numeric fluents. These two actions can be perfectly captured in our approach with a single action
It is easy to see that the propositional preconditions and effects are shared among all the modalities, while the preconditions and effects concerning numeric fluents are specific for each modality.
To make the domain more challenging in the context of MMA, we added to the original ZenoTravel domain two modalities for the board and debark actions, namely, normal and express which take different amounts of time (shorter for express) and different costs (lower for normal).
We are now in the position for introducing the notions of multi-modality plan and of multi-modality planning task.
Given the domain A is the set of possible MMA instances over Since we are interested in plan execution, we consider in this formulation instantiated actions only. However, our implementation supports action schema too. The extension is straightforward.
Note that our notion of MMPP is similar to the Numerical Planning Problem (NPP) proposed by Gerevini et al. [12], the main difference is that we adopt actions with modalities rather than simple actions. It is easy to see, however, that an MMPP can be translated into an NPP; it is sufficient to flatten each MMA a in A by generating a set of corresponding simple actions
In the following definitions, we use the notation
Given a MMPP
In other words, a sequence π of MMAs (specified with their modalities) represents a solution for Π when the execution of π transforms the initial state I into a final state
Relying on the notion of propositional applicability of the MMAs, we can also define a propositional solution for a MMPP as follows.
Given a MMPP
This means that a plan π is a propositional solution for the problem Π if it reaches the propositional goal
As previously anticipated, the execution of a plan π in the real world can be threatened by the occurrence of unexpected events such as variations in the resource consumptions, or assumptions that, made at planning time, become invalid at execution time. It is therefore essential to detect, while the plan is still in progress, any unexpected deviation from the nominal behavior.
Given an MMPP
π is valid at step i iff the plan segment
π is partially valid at step i, iff the plan segment
otherwise, π is invalid at step i: the plan segment
During the execution of the plan, it is necessary to determine the status (i.e., either valid, partially valid or invalid) just before starting the execution of each MMA involved. For this purpose, we do not consider just the effects of the last performed MMA, but we also verify whether the propositional goals
The assessment of the actual environment state after the execution of an action is, in general, a complex problem since the environment is, in most cases, not fully accessible, and only partial observations about the world are available. The problem of plan execution in partially observable environments has been dealt with in [27,36]. In this paper, however, we assume that the observability level is sufficient for measuring the effects of an action
In principle, whenever the plan π is no longer valid, a replanning mechanism should be invoked in order to find an alternative way to reach the desired goals. As we have seen, however, the clear distinction between the propositional and numerical aspects within an MMA allows us to distinguish between a completely invalid plan (which is no longer executable because some of the required propositional fluents are missing), and a partially valid plan which is just propositionally consistent but not numerically.
The idea is that, while an invalid plan can only be repaired by means of a (possibly expensive) replanning step, a partially valid plan can be repaired more effectively by reconfiguring its MMAs. That is, we try to reuse the effort made for the synthesis of π by adjusting the way in which the MMAs in π will be performed. We therefore propose to repair a partially valid plan not via a replanning step but via a reconfiguration phase, and to do this, we now formally introduce the notion of Dynamic Modality-Assignment Problem (DMAP).
(The Dynamic Modality-AssignmentProblem (DMAP)).
Let
The Dynamic Modality Assignment Problem Φ is a tuple π is a plan whose first i MMAs have been already performed; π is partially valid at step i;
(Solution of a DMAP).
The solution (if any) of a DMAP is a new plan for each MMA for at least one MMA
It is worth noting that the solution
Observing the DMAP definition, it is quite clear that the search space generated by the problem is exponential in the number of the MMAs still to be executed. Given
The problem is ‘dynamic’ as it can arise at any step of the plan execution, and assignments made at given step could be reconsidered at a following one.
Having solved the DMAP we are sure that the new MMA plan is a valid solution for the overall MMPP; that is:
Let the plan π be partially valid at step i w
To show that
Theorem 1 assures us that solving a DMAP does restore the nominal plan execution. The new plan
It is easy to see that the solutions space of a DMAP is a subset of the solutions space defined by MMPP. It is therefore easy to see that a solution for the DMAP, if any, is also a solution for the MMPP. However, the opposite does not hold, namely when the DMAP has no solution, the MMPP might instead have a solution.
The following theorem shows that, when the DMAP fails, the solutions to the new MMPP must differ from the original plan for at least one action.
Let
there is at least an MMA in
there is at least an MMA in
the order of MMAs in
The proof proceeds by contradiction. Let us assume that the solution found for
Moreover, by definition of
While Theorem 1 states that in some situations a MMPP can be solved as a DMAP, here Theorem 2 states that when a MMPP cannot be solved as a DMAP, then a solution
Another reason to anticipate the MMPP with the DMAP is that solving a DMAP is computationally easier than solving a planning problem. To understand why, let us remember that the classical planning fragment (i.e., STRIPS) is PSPACE-complete (see [3]). Despite there is no complexity result (to the best of our knowledge) on the numeric extension of the classical paradigm, the numeric planning is at least as complex as STRIPS. MMPP is representative of the numeric planning. In other words, the complexity class of the decision problem NUM-PLAN-EX,7
Let us refer to such a class by means of Class(NUM-PLAN-EX).
On the other hand, the DMAP characterization given in this section is less complex. In fact it is easy to show that:
Let ASSIGN-EX be the decision problem of determining whether an instance of a DMAP is satisfiable
The proof of the NP membership comes from the polynomial conversion mechanism discussed in the Appendix, where we show that a given DMAP can be transformed into a CSP, which is an NP-complete problem. That is to say, ASSIGN-EX ∈ NP.
Given the proposition above we can also state that:
If the hypothesis that PSPACE ⊃ NP holds
The important consequence of Corollary 4 is that solving a DMAP is simpler, from a computational complexity point of view, than solving an MMPP. This gives us a formal guarantee that, in the worst case, reconfiguring the modalities of an MMA plan is simpler than replanning from scratch. This theoretical result is also confirmed experimentally, as discussed in Sections 5 and 6.
So far, we have pointed out that a plan can be made partially valid or invalid by unexpected contingencies occurring during the execution phase. We have also suggested that, when the plan becomes partially valid, the agent can try to solve a DMAP to reconfigure action modalities. On the other hand, when the plan becomes invalid, the agent has to find a completely new course of actions (i.e., replanning from scratch).
In this section, we present the FLexible EXecution via Reconfiguration and Replanning (FLEX-RR) methodology, and discuss how it can be implemented by exploiting CSP techniques ([37]).
Before that, we provide some hints on how a DMAP is translated into a CSP (a more detailed discussion is reported in the Appendix).
CSP variables and constraints
In CSP-based classical planning [8,22], the CSP representation requires two sets of variables: one to model the actions, and the other to model the states generated by the action execution. Analogously, also the encoding of a DMAP into a CSP representation relies on two sets of variables:
As in CSP-based planning (based on a graphplan-like structure [1]), we duplicate the state variables as many times as there are “levels” in the plan π.8
Of course, since we are not solving a planning problem, but a DMAP, we do not need the step of graph expansion as the sequence of actions is already known.
To capture only those assignments of MODs which are consistent with the actions and the problem at hand, each state level is logically bound with the preconditions of the MMA belonging to that level l, and with the effects of the MMA at level
To limit the search space of the arising satisfiability problem, we consider as decisional only the MODs variable. As matter of facts, the numeric profile throughout the plan can be determined as the implication of the modality selection.
For the sake of clarity, Fig. 2 summarizes the variables and the constraints involved in a plan of length n; in particular the figure describes:
State Variables
Modality Variables
Intuitively, the superscript j of a variable

A CSP representation for a generic MMA plan of length n in a problem with z numeric fluents and k goal constraints.
Having defined the operative way to handle the DMAP using a CSP encoding, we can now introduce the continual planning approach the agent follows to adaptively perform its plan.
Algorithm 1 sketches the main steps of such a control loop. The basic idea is similar to the continual planning approach [2,7] in the sense that, at each execution step, the algorithm verifies whether the conditions to achieve the goal are satisfied or not. As innovation w.r.t. traditional continual planning systems, we distinguish between partially valid and invalid plans, thus we have different strategies for the plan adaptation.

FLEX-RR
At the beginning, the algorithm initializes two important structures: the
The algorithm iterates over the plan actions as long as there is at least one action to execute. The algorithm returns either Success or Failure depending on whether the status S reached after the execution of π satisfies or not the goal G. The iteration may be also interrupted when the replan mechanism is not able to find a solution. In that case the replan returns a plan of size 0, so the while condition is not satisfied and a Failure is returned.
At each iteration, the algorithm observes and updates the world state S (the senseAndUpdate function). Accordingly, also the
Of course, it is important to note that the propagation machinery must take care also of the precondition all along the plan being executed. The mechanism can be improved by exploiting the numeric kernel notion defined in [38].
In case the plan is invalid, a replanner is invoked to build a new plan from the current state S to the goal state G.10
To allow the interaction between our software module with a generic PDDL numeric planning (Metric-FF, [17] in our case), we apply the conversion mechanisms sketched in Section 2. When the planner is invoked, each MMA model is flattened to the traditional PDDL model; these models are therefore used by the planner. Then, once the plan is computed, the PDDL actions of the plan are newly transformed into MMAs.
In case the plan is partially valid, we have detected a DMAP problem, and FLEX-RR tries to solve it by invoking the reconfiguration module (ReCon). ReCon finds (if exists) an alternative assignment of the modalities to the actions still to be performed. Note that, since the DMAP problem might not have solutions, ReCon could return an empty plan; thus, also in this case, the algorithm will invoke a replanner to resolve the impasse.
The next two algorithms explain in detail: (i) how the CSP model is updated along the plan execution, and (ii) how the CSP solver is invoked.

SenseAndUpdate
Algorithm 2, SenseAndUpdate, takes in input the index i of the last performed action, the current state S of the world, and the CSP model to update.
First of all, new observations
In any other execution step (i.e.,

ReCon
Algorithm 3 shows the high-level steps of the ReCon module. In particular, ReCon has to solve a DMAP problem, and takes in input the plan π to be repaired and the CSP model which, as said above, already encodes all the pieces of information relevant for the solution of the DMAP. A solution consists in an assignment of modalities to the modality-variables that are not set to
ReCon tries to solve a DMAP by means of a CSP solver. If the CSP-solver finds a solution, this is extracted and used to reconfigure the plan π which is therefore returned to FLEX-RR. In other words, each action
An example of how FLEX-RR intervenes
To exemplify how FLEX-RR is able to monitor the execution of a plan and to repair it in case the plan is no more valid, let us consider a simple planning problem from the ZenoTravel domain. Our problem P involves 7 entities, among which: three persons (
Let us suppose that the following plan π has been provided as a solution for the planning problem above.
Before starting the execution, FLEX-RR builds the
Let us suppose that the execution of the first two actions in the plan does not raise any problem. However, after the execution of action
FLEX-RR invokes the propagation mechanism to assess whether the final state will satisfy the goal constraints despite this unexpected situation. Let us assume that the propagation step highlights that the fuel will assume a negative value, which of course is not consistent with our numeric goals. However, the propositional goals will be achieved the same in the final state S8. This means that, after the execution of
ReCon has to solve a new DMAP by finding a new assignment of modalities to the actions not yet executed. For instance, it finds the following reconfigured plan:
It is easy to see that the modality of
After these changes, the plan is again valid, and its execution can resume from the current state S3.
It is worth noting that the new allocation of modalities produces a new plan
In the previous example, FLEX-RR has been able to find a solution by means of ReCon without the need of replanning. This is not always the case. Let us suppose that the execution of
Also in this case the plan is partially valid, and therefore ReCon is invoked for solving a DMAP. However, since the deviation on the fuel is significant, it is no possible to restore the validity of the plan via a simple reconfiguration. Thus, ReCon fails and FLEX-RR invokes the replanner. The new planning task consists in finding a plan that, starting from the current state S3, achieves the same set of propositional and numeric goals. In this case the replanner finds a solution:
Note that the new plan segment has substituted the original plan. In particular, thanks to the introduction of a refuel action
Finally, it is worth noting that also the replanner may fail as not all the anomalous situations encountered during the execution are repairable. More important, in many real-world scenarios, the agent must react to unexpected situations in a short amount of time. That is, FLEX-RR has to find a solution within a given threshold; otherwise, the plan goals must be revised and a new plan must be synthesized.
Measuring the stability of the plan
In the previous section we have seen that in FLEX-RR both the ReCon and Replanning mechanisms can be invoked for repairing the current plan from a failure. One relevant question concerns how different the new plan is from the old (failed) plan. Obviously, the repair process must have produced a new plan that differs, even just slightly, from the old plan in order to overcome the impasse. However, not all the changes induced by the repair process have the same weight. Intuitively, changing just a modality in one MMA should be considered as a “minor change”, while the substitution of an MMA with another one should be considered as more intrusive. When the new plan is substantially different from the one synthesized off-line, the behavior of the agent becomes difficult to predict for an external observer, as for instance a human supervisor. As matter of facts, artificial agents (e.g., robotic systems) or humans may have expectations over the agent’s tasks, which may not be explicit in the action models. For this reason, there is an interest to have the repaired plan with the minimum amount of changes: this is particularly true in domains where the agent has to cooperate with others agents (both artificial or human) in order to achieve a global common goal. In fact, a relevant change in the services (and/or in the order) provided by an agent to other agents could cause a major reshaping in the plans also of other agents in order to continue to be able to achieve the common goal.
A first answer to the question is provided by Theorem 2, which guarantees that if Replanning finds a repair plan after ReCon has failed, then the new repair plan cannot have the same causal structure of the old one (i.e. at least an action has to be added or deleted, or the order of the actions has to be changed).
This result is relevant since it provides the basis for stating that the repair plans obtained via ReCon are more stable, since they preserve the causal structure of the original plan. However, in FLEX-RR both replanning and ReCon are invoked and it would also be useful to evaluate whether the insertion or deletion of a single action in the new plan has a more negative impact on the plan stability than the change of the modalities in many actions (as may happen as the result of ReCon).
To provide a formal basis for such an analysis, we propose a measure of the stability of a repaired plan, based on the notion of distance between two plans (i.e., the original and the repaired plans11
Other stability measures have been reported in literature, none of them unfortunately takes into account actions with different modality of execution ([9,11,33]).
Our stability measure relies on the distance
inserting ( replacing ( swap (
It is reasonable to assume that
The complete procedure is described in [39].
The stability measure defined by the above equation ranges from 0 to 1. In particular, when
Let us consider again our running example, and compute the stability of the repaired plans inferred by FLEX-RR (Fig. 3) and by REPLAN (Fig. 4), in the two situations described in Section 3.5.

FLEX-RR.

REPLAN.
Given that α (the insertion/deletion cost) equals to 5, γ (modality replacement cost) to 1 while θ (the swap cost) is set to 6, we have that:
FLEX-RR: the trivial transformation costs 50 (five deletes and five adds are involved), while the distance between the original and the repaired plans is 3 (3 changes of modalities), and hence the stability grade is 0.94, meaning that the two plans are very close to each other; REPLAN: the trivial transformation costs 55 (five deletes and six adds are involved), the distance between the original and the repaired plan is 9 (the add of a refuel action and 4 changes of modalities) and hence the stability grade is 0.83.
This means that the plan repaired via replanning is less stable than the plan repaired via ReCon.
To evaluate the performance of FLEX-RR, and in particular the contribution given by the reconfiguration mechanism, we compared FLEX-RR with two architectural variants: FLEX-REPLAN and FLEX-LPG-ADAPT. FLEX-REPLAN differs from FLEX-RR in that it invokes a replanner not only when the plan
The three architectures have been assessed with respect to three parameters:
Competence: the capability of a strategy of finding a repair plan when the original plan becomes partially valid during the execution. In particular, we measure the competence as the rate of successes in recovery from the unexpected contingency. In our experiments we allotted the various architectures 240 s of CPU time, which is a threshold within which a solution must be provided. In the case of FLEX-RR, we set the time for ReCon module up to
Further experiments have been run for different ratios or by allotting different amount of CPU time. These experimental results (see http://www.di.unito.it/~scala/software) show a general trend very similar to the one we will deeply discuss in this section.
When ReCon proves that a DMAP has no solution in less than 24 s, the time not used by ReCon is given to Replan.
Efficiency: we compare the CPU time of the three architectures. In principle, we will expect that the computational effort will be lower when the cases are solved via reconfiguration, as the search space generated by the DMAP is much more limited.
Stability: we are interested in understanding how the systems impact the structure of the plan to be repaired. In the following experiments we used the same setting of weights reported in Section 4.2. Relying on Theorem 2, we expect that, when the plan is repaired by ReCon, we will keep the plan quite stable, since the causal structure remains unchanged. When the contingency is solved via Replan, we expect a stability reduction since actions can be added/removed freely and also their order can be changed. On the other hand, since LPG-ADAPT is built with the concept of stability in mind, we expect that FLEX-LPG-ADAPT turns out to be quite competitive in this perspective.
Tests were conducted in three numeric domains, extended to support actions given in form of MMA. That is:
ZenoTravel
DriverLog
Planetary Rover
The first two domains are from the Third International Planning Competition;15
For each domain we generated a set of problems varying the number of objects to be considered. For a given problem we generated a plan whose execution is handled by means of a software simulator. Here, numeric fluents used to model time and consumable resources have been noised in order to reproduce unexpected contingencies. More precisely, the simulation of the deviations on the expected state of the system is obtained by altering the impact of each action executed. In this way, we collected a series of situations where the plan became partially valid at least once during its simulated execution.
Each parameter (competence, efficiency and stability) has been measured w.r.t. four degrees of noise. The first degree alters the resource consumption of 25% more than expected, the second of 35%, the third of 50%, and the last one of 75%. We aim at understanding how the noise impacts the performance. Intuitively we expect that, as long as the noise increases, the chance of finding a solution decreases, especially for ReCon. Thus the noise degree could have an impact both on the competence and on the stability.16
However, as studied in [30], one of the main factors impacting the performance of numeric planning is the strictness of the constraints. This means that, an increase of the noise impacts not only ReCon, but also Replan and LPG-ADAPT because the numeric problem to be solved becomes more complex (see [12,17]).
For all the domains tested, cases have been split in two classes of difficulty, each determined by the strictness of the constraints. The idea is to assess the behavior of the system w.r.t. the combination of the difficulty of the problem and the amount of noise injected.
In our experiments we have not taken into account situations in which the plan became invalid, since we were interested in evaluating the ability of FLEX-RR in repairing partially valid plans, i.e., when the reconfiguration can be exploited.
FLEX-RR has been developed in Java; it receives in input the domain, the problem and (optionally) the plan description written in an extended version of PDDL 2.1 level 2, which incorporates the notion of modalities. FLEX-RR exploits and extends a PDDL Java Library, namely PPMaJaL.17
As a CSP solver we used Choco 2.1.418
Choco is a Java library for constraint satisfaction problems (CSP) and constraint programming (CP). Visit
FLEX-REPLAN and FLEX-LPG-ADAPT inherit the same Java implementation of FLEX-RR, and they exploit, respectively, Metric-FF (in a similar way to FLEX-RR) and the LPG-ADAPT system developed for OAKPlan [13]. LPG-ADAPT is used as a black box and, to activate the adaptation capability given the encountered discrepancy, we invoke the system by setting as input the remaining set of flattened PDDL 2.1 actions to be executed.
Experiments ran on a 2.53 GHz Intel(R) Core(TM)2 Duo processor with 4 GB (under the operating system: Ubuntu 10.04).

DriverLog domain: competence for easy (left) and hard cases (right). (Colors are visible in the online version of the article;

DriverLog domain: stability for easy (left) and hard cases (right). (Colors are visible in the online version of the article;
For the DriverLog domain, we collected an amount of 1442 cases, equally subdivided into easy and hard cases. Our domain definition differs from the one of the planning competition in that we have two modalities for loading packages (i.e., normal and fast), and two modalities for driving trucks (again, normal and fast). Intuitively, an action performed in normal mode is cheaper in terms of resource consumption, but slow. Whereas an action performed in fast mode is more expensive but faster.
The generated problems vary on the number of packages, locations and trucks to consider. The difficulty of a problem is determined by the involved constraints. More precisely, the problems in the easy set have just a constraint on the total time of plan execution, while problems in the hard set have a further constraint on the maximum amount of power spent. The length of the produced plans ranges from 14 to 90 MMAs.
Competence
The histograms of Fig. 5 reports the competence of the three tested systems. The performances have been measured by considering the percentage of cases solved by a system within the deadline of 240 s. The cases are organized according to the degree of injected noise, and to their complexity (i.e., easy or hard).

DriverLog domain: efficiency for easy (left) and hard cases (right). (Colors are visible in the online version of the article;
By observing Fig. 5, it is quite clear that FLEX-RR outperformed both FLEX-REPLAN and FLEX-LPG-ADAPT in all the given scenarios. In the easy set, FLEX-RR was able to repair the plan in almost all the considered cases; a little decrease of the performance (96.17%) is observable just for the highest degree of noise.
Regarding the easy cases, also FLEX-LPG-ADAPT turned out to be quite good. The competence ranges from 92% to 96% in the first three degrees of noise; only when the noise degree is 75%, FLEX-LPG-ADAPT is not comparable with FLEX-RR (79.23% vs. 96.17%).
As refers to FLEX-REPLAN, it is evident that its performance is quite negative. In fact, just in the first two degrees of noise, FLEX-REPLAN repairs more than 70% of the cases.
Considering the hard cases, the gain between the performance of FLEX-RR and the other two systems becomes more evident; in particular, for noise degree 35%, FLEX-LPG-ADAPT is almost 30% less competent than FLEX-RR, whereas FLEX-REPLAN is more than 60% less competent than FLEX-RR.
FLEX-RR is therefore the most competent in the DriverLog domain. FLEX-LPG-ADAPT performed better than REPLAN, but it is not competitive with FLEX-RR, in particular for the hard cases set.
Figure 6 shows the average stability. The set of cases considered by this evaluation refers to those which have been solved by all the systems. As reported in Section 4, the stability ranges from 0 to 1, where 1 is the maximum score and the 0 value stands for a repaired plan that is completely different from the original plan. As for the competence, we studied the stability by considering separately the two sets of easy and hard cases, and the different degrees of noise.
Results in Fig. 6 show the efficacy of FLEX-RR in keeping the repaired plan stable. FLEX-LPG-ADAPT preserves to some extent the structure of the previous plan, and is more effective than FLEX-REPLAN.
FLEX-RR reaches a stability rate above 0.9 in almost all the scenarios, except that in the hard cases with noise degrees 35% and 75%. As we will discuss in the next section, this result is due to the contribution of the replanning mechanism in the FLEX-RR resolution task.
Efficiency
This evaluation analyzes the CPU time spent by the system in providing an answer, which may be either a positive one (a solution has been found) or a negative one (there is no solution for the problem at hand).
As known in the planning community (see for instance [23]), in most cases the distribution of CPU times cannot be synthesized in few parameters. For this reason, we describe the distribution by reporting the percentage of cases that have been handled in a given interval of time. We have introduced 6 time intervals which partition the interval 0–240 s, plus an additional interval for capturing the timeout of the system. The time has been measured in milliseconds.
The results are in Fig. 7, which consists of three sub-tables, one for each system. Each row in the table reports the percentage of cases “with answer” in the time intervals above plus the timeout, discriminating between the easy and hard cases, and among the different degrees of noise. To ease the readability of the table, color shades from grey (low percentage) to orange (high percentage) are used to highlight the intervals where most of the cases are answered.
By observing the results, we can see that FLEX-RR solved the majority of the easy cases in the first interval of time. This means that the system has provided an answer in less the 100 ms. Whereas FLEX-REPLAN and FLEX-LPG-ADAPT, in most of the cases, required significantly longer. Both systems suffered from the increasing of the noise, and even for the lowest degree of noise (25%), both FLEX-REPLAN and FLEX-LPG-ADAPT needed often more than 5 s. By observing the timeout columns we can understand the reason of the low competence of FLEX-REPLAN. In fact, even for the 25% of noise, FLEX-REPLAN did not provided an answer for the 24% of the tested cases.

ZenoTravel domain: competence for easy (left) and hard cases (right). (Colors are visible in the online version of the article;

ZenoTravel domain: stability for easy (left) and hard cases (right). (Colors are visible in the online version of the article;
In the hard setting, the FLEX-RR efficiency remained rather good; most of the tested cases have been solved in less than 1 s for all the noise degrees, making exception for the highest degree where, in the 31.15% of the cases, FLEX-RR reached a timeout.
Proportionally to the increase of the noise, FLEX-LPG-ADAPT became more and more time demanding, ending with 49% of timeout in the 75% setting. FLEX-REPLAN performed even worse reaching the timeout in more than the 65% of the cases in all the four degrees of noise.
In our version of the ZenoTravel domain we have two modalities of execution for the board and the debark, and two modalities for the fly (as in the original ZenoTravel formulation). Besides the standard behavior, the board and the debark have a second execution modality called “express”. The standard modality is cheaper but slower, while the express modality is more expensive but faster. The fly action is already described in Section 2.2.1.
We collected 1036 test cases (518 hard and 518 easy). The generated problems vary from each other on the number of passengers and locations; whereas the difference between the easy cases and the hard ones relies on the number of constraints encoded in the final goal. In particular, the easier cases are constrained in terms of total_time_spent while the harder ones add a (further) constraint on the total amount of fuel to be used. Plans involve up to 57 MMAs.
The total number of cases is obtained by running a given problem and plan over the four amounts of noise.
Competence
Figure 8 shows the competence of the systems.
By observing the histogram, for both the easy cases and the hard ones, FLEX-RR has been more competent than both FLEX-REPLAN and FLEX-LPG-ADAPT. In particular, for the easy cases, the amount of cases solved with success ranges from 62% to 87% for FLEX-RR, whereas between 57% to 77% for FLEX-REPLAN, and 42% to 61% for FLEX-LPG-ADAPT. For the hard cases the advantage of FLEX-RR w.r.t. FLEX-REPLAN increases for the first three degrees of noise; we can observe in fact a gap around the 15%.
As expected, the competence of the tested systems decreases as long as the noise increases.
It is interesting to see that in the last degree of noise FLEX-RR performs as FLEX-REPLAN. As we will see in the next section, in this domain (in the 75% noise setting, hard cases) most of the merit in FLEX-RR is due to the replanning strategy.

ZenoTravel domain: efficiency for easy (left) and hard cases (right). (Colors are visible in the online version of the article;

PlanetaryRover domain: competence for easy (left) and hard cases (right). (Colors are visible in the online version of the article;
Figure 9 reports the stability measured in the cases solved successfully by all the systems tested. Results proves the ability of FLEX-RR in keeping the structure of the plan quite stable, however the system that produced more stable plans in this setting was FLEX-LPG-ADAPT. Of course, the comparison takes into account only a portion of the cases, because in several cases FLEX-LPG-ADAPT reached the timeout.
Observing the competence and the stability results, it is worth noting that FLEX-LPG-ADAPT is able to solve a case, in particular, for those situations in which the solution is quite close to the starting plan. On the other hand, when the solution is rather distant to the original plan, the performance of FLEX-LPG-ADAPT tends to decrease.
As expected, FLEX-REPLAN often finds solutions that are quite different from the original plan.
Generally, the histogram reveals the (quite obvious) relation between stability and noise. That is, as long as the noise increases, the structure of the plan cannot be preserved, and even FLEX-RR produces less stable repaired plans. This is evident especially in the highest degree of noise, where the stability of FLEX-RR is very close to the stability of FLEX-REPLAN.
Efficiency
In Fig. 10 we can see the CPU time spent by the architectures to handle the partial validity of the plan over all the noises considered.
In each degree of noise, the distribution of cases solved by FLEX-RR (making exception for the first interval of the time with noise degree 25%, where in the 56% of cases have an answer in less than 100 s) has been rather uniform. Of course, the population shifts to the right of the table accordingly to the increase of the noise.

PlanetaryRover domain: stability for easy (left) and hard cases (right). (Colors are visible in the online version of the article;

PlanetaryRover domain: efficiency for easy (left) and hard cases (right). (Colors are visible in the online version of the article;
Concerning the increase of timeout situations, a similar trend can be found in the behavior of FLEX-REPLAN. In fact the timeouts grow from 16% to 31%. As a difference between the two strategies, we noticed that, when FLEX-REPLAN can solve the task, the resolution process is very fast. Otherwise, FLEX-REPLAN does not provide an answer within the time threshold.
As far as FLEX-LPG-ADAPT is concerned, the low level of competence that we measured before can be explained by the significant number of timeouts reported in Fig. 10, especially for the hard cases.
In our version of the PlanetaryRover domain, presented in [25], the most of the MMAs has two or three execution modalities.
In total, we generated 906 test cases: 403 easy cases and 403 hard cases. Problems vary on the number of locations to be visited and the number of pictures to be taken. The hardness of the problem is controlled by the number of constraints specified in the goal set. In particular the easy case constrains the total time and the power of the rover, while a hard case specifies a further constraints on the communication cost of the mission. In particular, as we will see, this last constraint makes the problem very hard for a generative approach.
The plans involve up to 34 MMAs.
Competence
Figure 11 shows the competence of the three architectures in the easy and the hard cases. FLEX-RR has outperformed the other systems for all the degrees of noise. In all the noise degrees of the hard cases, FLEX-RR has a competence ranging from 84% to 95%; whereas, FLEX-LPG-ADAPT and (still worse) FLEX-REPLAN have very poor performances. In particular FLEX-REPLAN never reaches the 16% of successes (this happens only for the 25% of noise), and FLEX-LPG-ADAPT remains around the 30% of successes; just with the noise degree of 35% the LPG-ADAPT based architecture overcame the 40% of successes.
Stability
The stability of the plans repaired by the systems is reported in Fig. 12. Also in this domain FLEX-REPLAN and FLEX-LPG-ADAPT are not comparable with FLEX-RR, which outperforms the other two systems. The plans produced by FLEX-RR have always a stability rate above 0.9 in every noise degree and both for the easy and the hard cases. Unexpectedly, the plans produced by FLEX-REPLAN are more stable than the plans produced by FLEX-LPG-ADAPT.
Efficiency
Similarly to the previous domain, the efficiency results are measured by Fig. 13. The most of the cases solved by FLEX-RR has been faced in less than 100 ms, for the easy cases, and 1 s for the hard cases. FLEX-REPLAN performed quite well in the easy cases, but not so well in the hard cases (see the high number of timeouts). Therefore, also in terms of efficiency, neither FLEX-REPLAN or FLEX-LPG-ADAPT has been competitive with FLEX-RR.
Experimental results: in-depth analysis of FLEX-RR
FLEX-RR success and failure situations
FLEX-RR success and failure situations

ZenoTravel domain: in-depth analysis. (Colors are visible in the online version of the article;

PlanetaryRover domain: in-depth analysis. (Colors are visible in the online version of the article;
Analyzing the FLEX-RR strategy reported in Algorithm 1, we can see that, while the recovery of an invalid plan can be attempted just via a replanning phase (or via an external plan-adaption tool), the task of repairing a partially valid plan could involve either just a reconfiguration step (when ReCon finds a solution), or the combination of reconfiguration plus replanning (when ReCon fails). As we have seen, this synergy is beneficial in terms of competence, efficiency and stability of the repaired plans.
However, an important question concerns the relative merit of ReCon and Replan in determining the final outcome of FLEX-RR.
In particular the ReCon (and Replan) outcome can be:
Solved: the DMAP (or the MMPP) has been solved and a solution is provided;
No Sol: the DMAP (or the MMPP) was proven to be unsolvable;
Timeout: ReCon (or Replan) did not provide any answer given the time deadline.
Therefore we can have 7 possible configurations, each of which corresponds to a Success or a Failure of FLEX-RR. These configurations are summarized in Table 1.
By considering all the possible outcomes of ReCon and Replan, Fig. 14 shows how the cases have been handled respectively for the easy and the hard cases of the ZenoTravel domain, over all the degree of noise considered (25%, 35%, 50%, 75%). The set of cases analyzed is the same used in the efficiency setting (see Section 5).
As expected, the percentage of cases solved just by ReCon decreases as the noise increases. This percentage ranges from 26% to 57% in the easy cases, and from 6% to 39% in the hard cases (these data explain why the FLEX-RR and FLEX-REPLAN have almost the same competence for the 75% noise setting in the hard class, see Section 5). Of course, the cases where ReCon fails (both for No Sol and timeout) are redistributed in the remaining 6 situations. It is worth noting to see that many of them are actually covered by the Replanning system, so the FLEX-RR final outcome is anyway a Success. For instance, in the easy setting, we can see that the percentage of cases for which ReCon proves the unsolvability of the DMAP and the Replan finds a solution increases from 20% to 24%, and, similarly in the hard cases such a range is on the average 17%.
The Fig. 14 makes evident that the key for the success of FLEX-RR is the synergy between ReCon and Replanning. Of course such a hybrid behavior comes to a price, which is highlighted by the union of the situations where the ReCon fails and the Replan is successful. It is worth noting that the overhead in CPU time payed by FLEX-RR in such a case is quite modest, since it does not impact neither the overall competence and the efficiency w.r.t. the other configurations (see Fig. 8 in Section 5). In particular, each case solved by FLEX-REPLAN has been solved by FLEX-RR as well.
Another interesting aspect arising from these results is that, for several cases where FLEX-RR did not provided any solution due to the reaching of the time deadline, ReCon was however able to prove the absence of an alternative reconfiguration of modalities. This is made evident by looking at the percentage of cases relative to the combination “ReCon No Sol” and “Replan Timeout”. While a system equipped just with the Replan capability would have been not informative at all.
Figures 15 and 16 report the result of the in-depth analysis of FLEX-RR in the domain of DriverLog domain and the PlanetaryRover one respectively. Not so many comments are necessary since in these domains most of the merit is of ReCon, in that in just few cases FLEX-RR had to use the replanning mechanism. It can be observed that in domain DriverLog, hard cases (Fig. 16), there is a relevant number of cases for which ReCon outcome was No Sol and the Replanning ended with a timeout; this shows that ReCon is quicker in detecting unsolvable cases than Replan.
We have also analyzed possible causes why the behavior of FLEX-RR in the ZenoTravel domain is somewhat different with respect to the other two domains. One distinctive feature of the ZenoTravel domain concerns the presence of the refuel action which can renew the consumable resource fuel when the airplane is at a given airport. The presence of a refuel is a very powerful tool for a “generative” approach (as the ones based on a plan adaptation), while it is completely useless for ReCon since it cannot alter the causal structure of the plan. It may be hence the case that, while ReCon spends time in searching for a new configuration of the action modalities, the insertion of a refuel action could be sufficient to repair the plan. This is probably the cause of several situations where ReCon ended with a timeout followed by the success of the Replan. Nevertheless, we also observed that, when the goal involves a constraint on the maximum amount of fuel that can be altogether consumed (e.g., because of the plan has to preserve a bound on the cost of the mission), the refuel action is no longer crucial to repair the plan. This is showed by the decrease of “ReCon Timeout” + “Replan Solved” situations in Fig. 14, where the hard cases are considered.

DriverLog domain: in-depth analysis. (Colors are visible in the online version of the article;

ZenoTravel domain: scalability. (Colors are visible in the online version of the article;
One of the factors which can represent a barrier for the performance of the reconfiguration is obviously the number of the actions in the plan. Actually, in Section 2 we have seen that the search space of a DMAP is exponential in the number of MMA involved in the reconfiguration. Therefore, the actual critical factor is not the length of the plan per se, but the length of the DMAP. So it is important to analyze at what extent the CSP is able to deal with (very) large search space when the length of the DMAP increases.
To measure such a parameter, we grouped the test cases of the previous section w.r.t. the number of the actions the DMAP consists of, and with respect to six intervals of time (0–100 ms, 101–1000 ms, 1000–5000 ms, 5000–24,000 ms, timeout). Figures 17, 19 and 18 summarize the results obtained in the three domains under examination. The cases are grouped in subsets of DMAPs of a given length (0–10 actions, 10–20 actions and so forth). This analysis concerns hence just the reconfiguration, and the results refers to both the cases solved successfully by ReCon and the cases for which ReCon proved the unsolvability of the DMAP.
It is evident that in the DriverLog and PlanetaryRover domains ReCon performs quite well, even for DMAPs involving a large number of MMAs. More precisely, in DriverLog the efficiency of ReCon decreases just for the interval 60–70 actions where 19% of the cases required a CPU time between 1–5 s. In the PlanetaryRover domain the majority of the cases required no more than 1 s, and this is true for cases involving up to 40 actions. In the last subset of cases (40–50 actions), the performance remained quite good (in fact 65% of the cases were solved in less than 100 ms), however a significant number of tests exceeded the time deadline (13%). ReCon behaves differently in the ZenoTravel domain. In fact, for short DMAPs (up to 10 actions) most of the cases are solved in less than 100 ms (i.e., 77% of cases). As expected, the ability of ReCon in solving DMAPs reduces as long as DMAPs get longer. In particular, when DMAPs involve more than 20 actions, the number of timeouts becomes relevant. These results motivate the behavior of FLEX-RR showed in the previous section. In fact, it has to take advantage of both ReCon and Replan. Whenever a plan is repairable by means of a reconfiguration of action modalities, FLEX-RR exploits ReCon to efficiently find such a new configuration. On the other hand, when a plan cannot be repaired by changing action modalities, FLEX-RR quickly switches to Replan, which can exploit all the power of a generative planning approach. Note that the quickness in the switch from ReCon to Replan is guaranteed by the fact that either ReCon proves that the DMAP is not solvable, or ReCon consumes all its 24 s, which can be considered an acceptable amount of overhead at this level of abstraction for most real-world applications.

PlanetaryRover domain: scalability. (Colors are visible in the online version of the article;

DriverLog domain: scalability. (Colors are visible in the online version of the article;
The MMA model reported so far has been formulated by assuming that (at this planning level) it is reasonable to abstract the behavior of the agent into two distinct levels: the higher describing what the agent does, and the other expressing the “way” in which each action of the plan is executed. This distinction has been reflected in a clear division between the propositional/qualitative schema of the top level and the numeric part referred to the execution modality level.
However, by observing the framework and the CSP computational model reported in the Appendix, it is easy to see that this restriction can be easily relaxed. In some scenarios in fact one would like to constrain the applicability of an execution modality not only to the availability of resources, but also to particular contextual conditions. While some of contextual conditions can be naturally expressed using the same numeric formalism (e.g., the movement of a vehicle is allowed only when the snow does not trespass the 20 cm), others could be much more naturally represented with qualitative constraints, i.e. with propositional constraints (e.g., the vehicle has to be provided with snow chains to follow this road in that execution modality).
To this end it suffices to extend the Definition 3. In particular each modality model should be modified with the tuple
The CSP extension at the basis of FLEX-RR is also straightforward; it suffices in fact adding a further constraint (for each modality containing also at least a propositional precondition) all along the (affected) snapshots representing the possible trajectories of states (i.e., the ones prior to the action execution); in addition, the CSP snapshots have to be enriched with the relevant atoms, to understand whether the modality precondition is satisfied. It is important to remember that the CSP handled online has to intercept all the (relevant) changes in the environment, even if they are uncontrollable or completely unexpected.
It is worth noting that the extension inherits most of the theoretical properties provided so far, so we expect that the competence, the stability and the efficiency will not be compromised.
Related works
The work presented in this paper adopts an action-centered philosophy (in line with PDDL, [10]), differently from a timeline based one ([4,28]). For this reason, our related works discussion will focus mainly on the relevant literature in this field of research. In particular, three topics are relevant for FLEX-RR: plan repair, CSP-based planning, and plan stability. In the following, we present some reference works in these areas and compare them with our approach.
Plan repair strategies have recently been studied in a number of works (see, e.g., [9,11,14,40]).
van der Krogt et al. [40] suggest that plan repair can be seen as a two-step procedure. First, the broken plan is unrefined by removing the actions causing plan flaws. Second, the resulting plan is repaired via a refinement step (see [19]) that adds actions to fill the gaps left during the previous phase. The proposed algorithm looks for a solution in the space of possible plans. In case the refinement step does not find a solution, a backtracking occurs and the unrefinement procedure is invoked again.
In [14], Gerevini et al. consider a planning graph structure, and assume that flaws affect different steps of a given plan. The proposed repair strategy solves separately each flaw: first it individuates the relevant plan “window”, i.e., the portion of the plan containing the flaw, and then it considers the window as an independent planning task. If the planning task has no solution, the window is enlarged to include one more action, and the process is repeated. In the worst case, all plan is included within a repair window, and the strategy behaves as a replanning from scratch procedure.
Garrido et al. [11] propose a mechanism to select between repair and replanning. In fact, they heuristically estimate the cost of plan repair (i.e., building just a bridge plan) and replanning from scratch (i.e., building the whole plan to reach the goals), and then solve the least expensive planning task.
The approaches mentioned so far represent a significant step ahead in understanding how plan adaptation can be applied in practice to the plan repair problem, despite contrasting complexity results on the topic ([32]). However, they present limitations when plan repair is to be performed on-line and has to take into account consumable resources and limited amount of time. In fact they are mainly designed for the classical/propositional fragment (making exception for [11] which considers the temporal planning too), and do not provide any theoretical bound on the plan repair task to be faced. Our FLEX-RR methodology overcomes these limitations, both theoretically, by providing a plan repair formulation given in terms of reconfiguration which is computationally simpler than a replanning approach, and practically by providing an effective architecture which combines competence and efficiency in a unified framework.
Another recent work strictly related to the plan repair problem is [24]. This work, however, is mainly focused on the problem of diagnosing the anomalous execution of a plan. In particular, the diagnosis identifies the root causes for an action failure. These causes are subsequently used to drive a replanning mechanism, based on a conformant planner, aimed at repairing the plan. The main disadvantage of this approach is that the conformant planning phase could be very expensive. The approach could therefore exploit FLEX-RR to repair a plan, at least in same cases, via reconfiguration rather than via replanning.
It must be noticed that, while the approaches mentioned so far do not consider resources, other works ([6,20,34,35]) focus on the temporal dimension. In these works, the notion of robust execution is close to the idea of flexible schedule. The rationale is that, if an agent can arrange its tasks along the time, then the actual execution of the plan is less prone to be affect by unexpected situations. Thus, these works are mainly focused on finding a schedule of the plan actions that guarantees some flexibility to the agent. The problem solved by these works is therefore slightly different from ours. In fact, while they are off-line (i.e., the schedule is synthesized before the plan execution phase), our work is on-line (i.e., a DMAP occurs when the plan execution is in progress). In addition, they are mainly focused on time delays, whereas we deal with exceptions that could threaten any consumable resource, included the time.
The work by Coles [5] is another example of off-line approach in which the planning phase tries to anticipate alternative scenarios that might occur at execution time. The result of this planning phase is a branched plan, where each branch “opportunistically” reaches a rewarded goal. It is interesting to note that new plan branches are created depending on the amount of agent’s resource, which are modeled as numeric fluents as in our case. As said above, however, our methodology is on-line and it has not to rely on scenarios anticipated off-line.
FLEX-RR relies on a CSP solver for resolving the DMAPs that possibly arise during the execution. This requires us to transform the planning domain under consideration into an appropriate CSP model. The work done in the planning community for solving classical planning problems via CSPs (see, e.g., [8,22]) has provided us a solid base for developing our translation schema. In the CSP planning encoding, variables represent actions and facts, while constraints are intended to allow only sequences of actions that are valid with respect to the domain and problem specification.
In our case, however, the CSP model is slightly different. Our formulation focuses in fact on the relation between action modalities and resource profiles. Resources are modeled as real variables, which can interact with one another according to the action models (e.g., the power spent by a communication depends on the amount of memory to be transmitted, which in turn depends on the performed sensing actions). It is hence evident that, while our transformation inherits the logical framework of the classical CSP-based planning, a pure propositional conversion is not sufficient to capture our representation.
Finally, in our approach we have also taken into account the notion of plan stability. This important feature has already been considered in previous works. In particular, the work by Fox et al. [9] introduces a measure of the plan stability in terms of common actions. As in our case, the metric is based on the notion of distance between two plans: given two plans π and
Recently, LPG-ADAPT, which is the system developed for [9], has been extended for supporting numeric fluents ([13]). LPG-ADAPT is designed to work in an off-line context, but can be exploited in a continual planning loop as well. For this reason, we integrated the LPG-ADAPT for an alternative configuration of FLEX-RR that we have called FLEX-LPG-ADAPT. Such a configuration substitutes the reconfiguration and the replanning step via the adaptation mechanism presented in LPG-ADAPT. However, as we have seen in the experimental section, in dealing with unexpected resources consumption, this architecture has not been competitive with FLEX-RR due to the lack of knowledge on the MMAs structure. Of course, when the plan to be repaired needs a major revision, FLEX-RR can be easily extended to invoke, at least in some case, the LPG-ADAPT system instead of the replanning from scratch.
Conclusions
This paper has addressed the problem of robust plan execution for planning tasks involving (complex) constraints on a number of resources. In coping with this problem, we have considered that the agent (i.e., the plan executor) might require not only reusable resources, but also limited, consumable ones; in our approach, in line with action-centered formalisms, we modeled such reusable and consumable resources as numeric fluents. To the best of our knowledge, previous approaches to robust plan execution that explicitly deal with resources are mainly off-line (e.g., [5,6,20,34,35]). On-line methods (e.g., [9,11,14,24,40]), instead, are mostly focused on the achievement of the (propositional) plan goals by interleaving plan execution and (re)planning.
The basic contribution of this paper is the notion of a Multi Modality Action, an extension to the PDDL formalism allowing to encode in the same model of the action the several execution modalities in which such an action can be carried out. Each execution modality models requirements and consequences of executing the action in that execution modality w.r.t. a given set of resources. While all modalities of a given action achieve the same propositional effects, a given modality models “how” these effects can be achieved. According to the Ghallab et al.’s terminology [15], modalities are intended to point out the alternative operational models their selection implies. The underlying hierarchical relation defined by the MMA model is an attempt to bridge the gap between the descriptive nature of the PDDL language and the operational representation leading the execution of lower level actions.19
For details on how modalities can be used for controlling the actual execution see also [26].
The combination of these two levels of representations allows a new (non-trivial) characterization of the repair problem, formalized as Dynamic Modality Assignment Problem (DMAP). The DMAP is the reconfiguration task of a given plan of actions expressed in terms of MMAs. This formulation has interesting theoretical properties that provides, on the one hand, a guarantee on the computational complexity bound limiting the reconfiguration in the NP class (while a replanning task is at least as difficult as a planning problem, which is PSPACE-Complete, see [3,32]), and on the other hand, a kind of repair that keeps rather stable the causal structure of the plan. This last property is important especially when the agent has to cooperate with others. High levels of stability, in fact, guarantee that the agent will provide the other agents with the services they are waiting for.
Relying on the MMAs notion and the DMAP formulation, the paper presented the FLEX-RR architecture, which is a continual planner exploiting the synergy of the reconfiguration (i.e., the DMAP) and of a replanning strategy. The objective is to address unexpected variations in the consumption of resources by trying first of all to reconfigure the plan actions, and just in case the reconfiguration fails, substituting a plan segment with a new one synthesized on-the-fly.
The advantages of FLEX-RR are easy to see. First of all, when the reconfiguration succeeds, the agent is able to provide a solution in a very efficient way and the structure of the original plan is preserved within the repaired one. On the other hand, when the deviations on the nominal behavior becomes prominent, the agent can exploit the maximum flexibility provided by a generative approach which can add, remove and change the order of the next actions to execute.
FLEX-RR encodes the DMAP as a CSP, and the replanning by means of a numeric PDDL planner. The CSP solver has been used to ensure to find an alternative allocations of modalities that respects the constraints all along the plan to be executed, while the numerical planner is aimed at finding a new course of actions given the current observed state and the goal of the planning problem at hand.
To evaluate the benefits of the approach, FLEX-RR has been thoroughly experimented in three planning domains, extension of classical numerical domains from the International Planning Competition. The results confirmed the theoretical advantages provided by the DMAP characterization: in repairing partial valid plans, FLEX-RR is more efficient than both a replanning mechanism and the plan-adaptation strategy of LPG-ADAPT [13]; moreover the FLEX-RR repair process keeps the structure of the plan rather stable. More important, given that we used CPU time deadline on the repair resolution task, we measured that FLEX-RR outperformed the other systems even in terms of competence; the gaps on the competence results are very significant in the majority of the (hard) case scenarios.
In future developments of the framework, we will investigate the problem of finding a modality assignment that optimizes a given objective function, such as the maximization of the plan stability or of the resource usage. Also in this case, our hypothesis is that the optimized reconfiguration problem could be easier to solve than replanning optimized w.r.t. a given metric that, as pointed out in [10,16], is undecidable without any restriction on the planning language.
Another direction of our research is to integrate the feedback of the reconfiguration towards the replanning in a more powerful way. Knowing that ReCon was not able to find a solution in the space of modalities could in fact be very helpful to control the search strategy of the replanner since some branches of the search space could be discarded.
Footnotes
Acknowledgements
We would like to thank the anonymous reviewers for the comments that helped to improve the quality of the paper, the Choco’s team for making freely available the CSP solver, Joerg Hoffman for the Metric-FF planning system as well as Alfonso Gerevini, Alessandro Saetti and Ivan Serina for the LPG-ADAPT system.
