Abstract
The Belief-Desire-Intention (BDI) model is a popular approach to designing flexible agents. The key ingredient of the BDI model that makes behavioral flexibility concrete is the inclusion of practical reasoning. On the other hand, researchers have pointed out that the BDI model lacks some ingredients of flexibility, essentially learning. Extensive research was therefore conducted to extend BDI agents with learning. Although this body of research is important, the key contribution of the BDI model, i.e., practical reasoning, has not received sufficient attention. For instance, for performance reasons, some of the concepts included in the BDI model are neglected by BDI architectures. Some researchers have criticized this neglect because it limits the agent’s ability to reason, so that the resulting reasoning is more or less flexible depending on the concepts explicitly included. The current paper aims to stimulate researchers to re-explore the concretization of practical reasoning in BDI architectures. Concretely, it aims to stimulate a critical review of BDI architectures regarding the flexibility inherent in practical reasoning, in the context of single agents situated in an environment that is not associated with uncertainty. Based on this review, we sketch a new orientation and suggest some improvements for the design of BDI agents. Finally, a simple experiment on a specific case study is carried out to evaluate some of the suggested improvements, namely the contribution of the agent’s “well-informedness” to the enhancement of behavioral flexibility.
Introduction
Artificial intelligence can be defined as a subfield of computer science which aims to construct agents behaving intelligently [99]. The agent paradigm is among the most important and useful abstractions used to design intelligent systems. Various agent models were proposed in the literature, among which is the popular Belief-Desire-Intention (BDI) model, based mainly on the manipulation of the three mental attitudes: beliefs, desires and intentions [28].
The BDI model relies on Bratman’s pivotal philosophical theory [11] which argues the importance of intention in resource-bounded practical reasoning. Practical reasoning is “the process of deciding, moment by moment, which action to perform in the furtherance of our goals” [97]. The classical scheme and concretization of practical reasoning includes two steps [28, 98]: (1) Deliberation: Deciding what goals or desires we want to achieve, with the chosen goals being qualified as intentions that should not be in conflict with each other, and (2) Means-end reasoning: How we are going to achieve these goals. The main mission of deliberation is to ensure that the chosen goal does not conflict with the agent’s focus (intentions), i.e., the chosen goal can be realized in conjunction with agent’s intentions. The abstract and generic design of BDI agents contains data structures explicitly representing the agent’s mental attitudes (beliefs, desires, and intentions) and tasks useful for the concretization of deliberation and means-end reasoning steps. More precisely, it mainly includes the four following tasks [97]: The belief revision, the option generation, the filter, and the action function. On the basis of perceptual input and the agent’s current beliefs, the belief revision function produces a new set of beliefs. The option generation function determines the options (desires) available to the agent, on the basis of its beliefs and intentions. The filter (deliberation function), on the basis of current beliefs, desires, and intentions, determines the agent’s intentions. The action function identifies an action to perform on the basis of current intentions.
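The four tasks described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited architectures: the names `AgentState`, `belief_revision`, `option_generation`, `filter_options`, `action` and `step`, the toy beliefs, and the `CONFLICTS` relation are all assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical conflict relation between goals (illustrative assumption).
CONFLICTS = {frozenset({"drink", "read_mail"})}


@dataclass
class AgentState:
    beliefs: set = field(default_factory=set)
    desires: set = field(default_factory=set)
    intentions: set = field(default_factory=set)


def belief_revision(beliefs, percepts):
    """Produce a new belief set from the current beliefs and perceptual input."""
    new_beliefs = set(beliefs)
    for percept in percepts:
        # Toy convention (assumption): "not_x" retracts belief "x".
        if percept.startswith("not_"):
            new_beliefs.discard(percept[4:])
        else:
            new_beliefs.add(percept)
    return new_beliefs


def option_generation(beliefs, intentions):
    """Determine the options (desires) from beliefs and current intentions."""
    options = set()
    if "thirsty" in beliefs and "drink" not in intentions:
        options.add("drink")
    if "mail_arrived" in beliefs and "read_mail" not in intentions:
        options.add("read_mail")
    return options


def filter_options(beliefs, desires, intentions):
    """Deliberation: adopt only desires that do not conflict with intentions."""
    chosen = set(intentions)
    for desire in sorted(desires):  # sorted only to keep the sketch deterministic
        if all(frozenset({desire, i}) not in CONFLICTS for i in chosen):
            chosen.add(desire)
    return chosen


def action(intentions):
    """Identify an action to perform on the basis of the current intentions."""
    return min(intentions) if intentions else None


def step(state, percepts):
    """One pass through the four tasks, in the order given in the text."""
    state.beliefs = belief_revision(state.beliefs, percepts)
    state.desires = option_generation(state.beliefs, state.intentions)
    state.intentions = filter_options(state.beliefs, state.desires, state.intentions)
    return action(state.intentions)
```

In this sketch, calling `step(AgentState(), ["thirsty", "mail_arrived"])` generates two options but adopts only one of them as an intention, because the filter rejects the option that conflicts with an already chosen intention.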
Intentions play a central role in practical reasoning, most importantly because they constrain and supervise the agent’s future decisions. Moreover, commitment, which is closely related to the concept of intention, embodies a trade-off between reactivity and goal-directedness. This trade-off is very important for an agent situated in a dynamic environment. A key characteristic of the agent that allows it to cope with the changing nature of the environment is behavioral flexibility. Behavioral flexibility (or adaptability) can be defined as the ability of the agent to change and adapt its behavior according to the situation [80, 100]. The BDI model is an interesting and popular approach to designing agents with flexible behaviors [9]. This model was proposed in response to the reactive agent model [15], used especially in navigation problems and the execution of low-level actions [82]. Concerns such as problem solving and commonsense reasoning are not treated in the reactive model [82]. In comparison with traditional planners, the most important advantage of BDI agents is that they can interact with their environment (reactivity), a capability needed to deal with dynamic environments [82]. This interaction takes place not on a physical level, as in the case of reactive agents, but on an intentional level. Therefore, BDI agents behave in a flexible way, combining reactivity with reasoning in a natural way [6]. The key ingredient of the BDI model that makes behavioral flexibility concrete is the inclusion of the practical reasoning capability, which yields a goal-oriented behavior that is responsive to environment changes [30]. On the other hand, several researchers have pointed out that the BDI model lacks some ingredients of flexibility, essentially a learning capability (see for instance [2, 37, 83]).
In fact, learning allows a BDI agent to rectify, refine and improve its available knowledge (a plan’s context, for instance) in the case of failures. In addition, learning allows the agent to acquire new knowledge in unforeseen situations. In this sense, the learning capability supports the agent’s practical reasoning with rectified and improved knowledge. As a consequence, extensive research (see for instance [2, 5, 23, 24, 31, 37, 83]) was conducted to extend BDI agents with learning. This body of research is very important, as it aims to enrich BDI agents with an important ingredient of behavioral flexibility. However, the key contribution of the BDI model, i.e., practical reasoning, has not received sufficient attention within the agent community. For instance, in many cases, for performance and simplification reasons, some of the concepts and capabilities included in the BDI model (deliberation and goal concepts, for instance) are neglected and not explicitly included by BDI architectures.1 Neglecting these concepts was criticized by some researchers [14, 44, 45, 59, 88, 89, 96], as it limits the ability of the agent to reason. As a result of this neglect, the BDI practical reasoning produced at the “architectural” level can be more or less flexible, depending on the concepts explicitly drawn from the “model” level. Only a few works [53, 59, 80] deal openly with, and give an explicit account of, the flexibility inherent in practical reasoning inside BDI architectures. The current review paper aims to stimulate researchers to re-explore the concretization of practical reasoning in BDI architectures. Concretely, this paper aims to stimulate a critical review of some representative BDI architectures regarding their behavioral flexibility (inherent in practical reasoning).
Furthermore, we explore our problem here in the simple context of single BDI agents (i.e., behaving without a social aspect) situated in an environment which is not associated with uncertainty. Accordingly, BDI architectures treating these two latter aspects are not included in the current review.
After presenting the general structure of a BDI agent (Section 2), this paper starts by discussing the important question concerning the conformity of the current BDI architectures to the features of the BDI model (Section 3.1). The total conformity of a given BDI architecture with the BDI model requires that this architecture includes all the BDI model’s features. The features of the BDI model are considered as the main flexibility requirements for BDI architectures. In Section 3 (precisely Section 3.2), we deeply examine the notion of behavioral flexibility in the previously cited context and propose some related requirements, enriching and refining the main flexibility requirements and existing accounts for flexibility. Based upon a set of proposed flexibility requirements, we analyze some representative BDI architectures (Section 3.3) and suggest some improvements for the design of BDI agents (Section 3.4). In Section 4, an experimentation on a specific case study is carried out to evaluate one suggested improvement for the design of BDI agents. More precisely, we show the contribution of the agent’s “well-informedness” (i.e., the richness of the set of relevant attributes upon which the agent bases its reasoning) in the improvement of the behavioral flexibility. Section 5 compares our suggested orientation against related works. The last section ends with conclusions and perspectives.
General structure of a Belief-Desire-Intention (BDI) agent and related principles
The behavior of a BDI agent is based upon the processing of beliefs, goals, intentions and plans [9]. Beliefs represent information that the agent has about its current environment and itself. Goals represent the world states that the agent wants and desires to bring about. Intentions specify the world states that the agent is committed to carry out. Plans are the means by which the agent realizes its intentions. In the BDI model, plans are typically predefined by the developer of the agent [9, 28].
The BDI agent executes a “Sense-Plan-Act” cycle (or equivalently named BDI cycle, agent reasoning cycle or deliberation cycle) [9]. In the “sense” phase, the agent’s beliefs and goals are updated in response to events. Those events can be either “external events” from the environment (e.g., new top-level (external) goal or new perception) or “internal events” from inside the agent (e.g., new sub-goal) [9]. In the “Plan” phase, the agent deliberates about the goals to be adopted as intentions. More abstractly, the agent in this phase updates its intentions in response to belief and goal change events [9]. Updating the agent’s intentions can give rise to changes in the set of intentions, or to changes to an individual intention in the set [9]. Changes in the set of intentions include for instance adding a new intention when it is created. Concerning change to individual intentions, it includes two kinds of updating [9]; the extension of an intention or the modification of an intention (see Fig. 1). The extension of an intention consists in creating and adding a new subgoal with its plan to realize the parent intention. In case of the modification of a given intention, a plan associated with the intention is replaced by another one, in the situation when the execution of the plan failed [9]. Finally, the last phase in the BDI cycle is the “Act” phase which has as mission to execute and realize the agent’s intentions.
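The two kinds of change to an individual intention described above (extension and modification) can be sketched as follows. The `Intention` class and the `extend` and `modify` helpers are hypothetical names introduced for this illustration, and the sketch assumes that alternative plans are predefined and stored with the intention.

```python
from dataclasses import dataclass, field


@dataclass
class Intention:
    goal: str
    plan: list                                        # steps of the current plan
    alternatives: list = field(default_factory=list)  # other predefined plans
    children: list = field(default_factory=list)      # sub-intentions


def extend(intention, subgoal, plan):
    """Extension: add a new subgoal, with its plan, under the parent intention."""
    child = Intention(subgoal, list(plan))
    intention.children.append(child)
    return child


def modify(intention):
    """Modification: after a plan failure, replace the plan by another one.

    Returns False when no alternative plan is left (assumption: the caller
    then decides what to do with the intention).
    """
    if not intention.alternatives:
        return False
    intention.plan = list(intention.alternatives.pop(0))
    return True
```

For example, an intention `Intention("deliver", ["drive"], alternatives=[["walk"]])` can first be extended with a `"get_fuel"` subgoal and, if driving fails, modified so that its plan becomes `["walk"]`.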
The BDI cycle described above, results in an agent adopting and trying to realize some intentions. Since the concept of “intention” plays a central role in the BDI model, it was characterized by some properties and principles. Here, we mention the main ones. The first main property of intentions is that they are persistent: If the agent adopts an intention, it should persist with this intention and attempt to achieve it. For example, if the agent fails initially to achieve an intention, then it is expected that the agent does not simply give up this intention but rather that it will try again to achieve its intention [10]. The second main property of intentions is the “consistency property”. When the agent has a given intention
Fig. 1. The main structure of a BDI agent.
The BDI model is viewed as an approach to obtain agents with flexible behaviors [9]. Therefore, the lack of the conformity of a given BDI architecture with the BDI model (in the sense of not incorporating all the features of the BDI model) will reduce the behavioral flexibility of this BDI architecture. Accordingly, the features and components of the BDI model can be seen as flexibility requirements for the design of agents.
Literature review
The first concretization of the BDI model was PRS by Georgeff and colleagues [33]. Because of its maturity, PRS is still actively used [9] and its architecture is reused in many other BDI architectures. PRS can be considered the parent architecture of most of the existing BDI architectures. Each of the BDI architectures that evolved from PRS aimed at treating some issue. The first important issue of PRS is that the goal concept is not explicitly represented, even though BDI agent reasoning is mainly goal processing. Some authors addressed this matter by proposing new BDI architectures. For instance, Hindriks and colleagues proposed the GOAL BDI programming language [42], Winikoff and colleagues conceived the CAN BDI programming language [96], while Braubach and colleagues conceived the BDI agent platform Jadex [12]. The second main issue that motivated the improvement of PRS is that the “consistency property” is not preserved. Among the researchers who treated this issue are Zatelli and colleagues [101] as well as Braubach and colleagues in the Jadex platform. The third main problem of PRS concerns the sequential execution of its cycle’s steps. Because of this sequential execution, the agent is not able to detect new events before the current iteration step is finished, which reduces the agent’s reactivity. Some authors treated this problem by using various techniques. For instance, Stabile and colleagues used the “filtering of perceptions” [85], Zhang and Huang introduced the parallel BDI architecture [103], while Koch and Dignum improved PRS with the “parallel BDI execution” principle and the “context observer” module [50].
Despite these important improvements, most state-of-the-art BDI architectures neglect some of the features and principles of the BDI model discussed in the previous section. The first neglected feature is the activity of updating and generating goals in the “sense” phase. In fact, only a small number of BDI architectures include this activity (Jadex and Koch & Dignum’s architecture, for instance); it is missing in most BDI architectures [9, 28]. The second neglected feature is the explicit representation of intentions and goals. As previously noted, this feature is implicit in the BDI model; since BDI reasoning is mainly goal processing, the agent’s goals should be explicitly represented if we want the agent to behave in an appropriate manner. In most current BDI architectures, intentions and goals are not explicitly represented, i.e., the agent’s goals are not first-class objects [9] (GOAL, CAN and Jadex are among the rare exceptions), which makes it difficult to implement some goal processing such as deliberation about whether to adopt a given goal [9]. The third feature not taken into account by the majority of BDI architectures is the “consistency property of intentions”, with the exception of, for instance, Jadex and Zatelli’s PRS improvement. In fact, in most BDI architectures, intentions are adopted without checking whether the new intention conflicts with intentions in execution. The fourth feature, neglected by only a few BDI architectures (Jadex and Koch & Dignum’s architecture, for instance), is the last property of intentions, the feasibility property (i.e., the property “intention is in a close relation with beliefs about the future”). In BDI architectures, neglecting this property leads to an intention being adopted without checking its feasibility in terms of the availability of an applicable plan.
In this section, we have reviewed and discussed different features (properties and components) of the BDI model that have been neglected by BDI architectures. As noted in Section 2, these features can be viewed as flexibility requirements for BDI architectures. Consequently, the incorporation of these missing BDI model’s features in the actual BDI architectures will lead to an enhancement of their behavioral flexibility. The next section (Section 3.2) will enrich the main flexibility requirements discussed herein with other requirements that of course fall squarely within the BDI model and facilitate its concretization. The resulting flexibility criteria will be used in the critical examination of some relevant BDI architectures and to suggest some improvements for the design of BDI agents.
Analysis of the agent’s flexibility notion
To the best of our knowledge, the first explicit account of the behavioral flexibility inherent in practical reasoning was given in the context of the ICAGENT work [53, 54]. The authors of ICAGENT propose that, to be able to balance reactive and deliberative behaviors (i.e., flexibility in the agent’s behavior), the agent must satisfy four main requirements. First, the agent must decide about the facts and events that should be monitored. Second, the agent must provide an explicit and as detailed as possible representation of the features it recognizes and exploits, i.e., features related to the environment, the agent’s mental state, the execution context of the plan, etc. Third, the agent must provide the reasoning tasks needed for exploiting these features (managing its plans, mental state, etc.). To illustrate these tasks, the authors give the example of the advanced plan management tasks proposed in [75]. The fourth requirement is that, among the reasoning tasks, the agent should decide in which situations it must react or deliberate, and balance and intermix reaction and deliberation.
Similar ideas related to the 2
A general definition of the agent’s flexibility is the ability of the agent to change and adapt its behavior according to the situation [80, 100]. A key point about the flexibility is that it is focused on measuring the ability of the agent to respond appropriately to variation in its environment [100]. Flexibility requires the agent to be able to detect the actual situation. In the case of BDI agents, the actual situation is materialized inside the agent in terms of its actual beliefs and goals. So, to attain flexibility, the agent’s architecture should include tasks for updating beliefs set and generating and updating goals set (these two tasks are included in the “sense” phase of the BDI cycle. See Section 2 for more details). These two tasks should monitor the environment at all times to permit the agent to have updated beliefs and goals sets and thus to be permanently informed about the actual situation. To behave flexibly, the agent’s architecture should include, in addition to the two previous tasks, some tasks for adapting the agent’s behavior to the beliefs and goals sets. The BDI agent behaves by pursuing the two main steps of practical reasoning. Accordingly, the agent begins its behavior by selecting the goals that will be realized (deliberation or thinking). After that, the agent proceeds to the realization of the selected goals via plans (acting or means-end reasoning). This acting task can lead to a revision of the executed plans due to some circumstances. The two tasks related to “deliberation” and “plans revision” are included in the “plan” phase of the BDI cycle. As the agent can have multiple goals/plans to execute, the acting task should schedule the execution of the selected goals. Beside the previous requirements and for performance reasons, the agent should take into account two other requirements. 
First, the agent should be responsive and react only to relevant perceptions, i.e., the agent should focus its perception ability on the relevant aspects of the environment. This focus ability is also called “filtering perceptions” (see for instance [84]). Second, the agent should be responsive and react only to relevant changes [50, 51, 75, 91]. Thus, the agent’s architecture should include a task or structure that monitors and detects the relevant events and changes and informs the agent about them.
The two previously cited tasks (filtering perceptions and monitoring of relevant changes) should function permanently and in parallel with the other tasks. In addition, these two requirements are related to the first flexibility requirement, proposed in [53, 54] and cited at the beginning of this section.
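A minimal sketch of these two tasks is given below, assuming that perceptions arrive as a dictionary of attribute/value pairs and that relevance is defined by a fixed set of attribute names. The names `filter_perceptions` and `RelevantChangeMonitor` are hypothetical and do not correspond to the mechanisms of [84] or [50].

```python
_MISSING = object()  # sentinel for "attribute never observed before"


def filter_perceptions(percepts, relevant_keys):
    """Perceptions filter: keep only the attributes the agent cares about."""
    return {k: v for k, v in percepts.items() if k in relevant_keys}


class RelevantChangeMonitor:
    """Relevant changes' monitor: report only changes of watched attributes."""

    def __init__(self, watched_keys):
        self.watched_keys = set(watched_keys)
        self.last_seen = {}

    def detect(self, percepts):
        """Return (attribute, new_value) events for watched attributes that changed."""
        events = []
        for key in sorted(self.watched_keys.intersection(percepts)):
            value = percepts[key]
            if self.last_seen.get(key, _MISSING) != value:
                events.append((key, value))
                self.last_seen[key] = value
        return events
```

With `monitor = RelevantChangeMonitor({"door"})`, a percept such as `{"door": "open", "wall_color": "blue"}` is first reduced to `{"door": "open"}` by the filter, and the monitor then reports the door event once; presenting the same percept again produces no event.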
In summary, in order to adapt its behavior to beliefs/goals changes, the agent’s architecture should include at least five tasks. The first task is the “perceptions filter” task for filtering the agent’s perceptions. The second task is the “relevant changes’ monitor” or “environment monitoring” task for detecting relevant changes and informing the agent about them. The three remaining tasks are related to the practical reasoning. The third task is related to the “Deliberation” or “Thinking about goals” which deliberates and thinks about goals to be realized. The fourth task is related to the “Acting” or “means-end reasoning” which has as mission the scheduling and the realization of the chosen goals. The last task is the “Plans revision” or “Thinking about plans” which is invoked by the “Acting” task in order to revise executed plans in reaction to some circumstances.
It is well known that the reactivity (responsiveness) of the agent to environment changes and events is an important characteristic for behavioral flexibility [82]. In the BDI agent’s literature, the word “reactivity” is mostly used in relation with re-deliberation (intention reconsideration) and plans revision. In this case, reactivity means that the agent must be responsive to events that occur in its environment, where these events affect either the agent’s goals or the assumptions (pre-conditions) underpinning the procedures (plans) that the agent is executing in order to achieve its goals [97]. In this context, the agent reacts by reconsidering executed intentions and plans. The reactivity in this context is qualified by “higher-level” reactivity as the agent reacts by changing its mind (revising its intentions and executed plans) and not by acting [21]. This higher-level reactivity is concretized by the task of updating intentions/plans, included in the “plan” phase of BDI cycle.
Besides the previously cited use of the word “reactivity”, there is a less frequent one, especially in the context of the BDI architecture ICAGENT [53]. In this latter work, the agent deals with changes in the environment by behaving either in a deliberative or in a reactive manner. More precisely, the agent in this BDI architecture can exhibit two kinds of behaviors: deliberation-oriented behavior (the default behavior of the BDI agent) and reactivity-oriented behavior. In the latter behavior, the agent acts without deliberation. In contrast, in deliberation-oriented behavior, the agent acts after deliberation, i.e., it acts carefully, by considering and thinking about its options and alternative actions. For BDI agents, we can explain the advantage of adding the reactive behavior to the default one (i.e., deliberative behavior) as follows. In BDI practical reasoning, the key objective of having a deliberation step before an acting one is to ensure that the new goal to be acted on is feasible, in the sense that it has at least one applicable plan (applicability checking) which does not conflict with the existing intentions (conflict checking).2 If the deliberation detects that the goal is not feasible because no applicable plan exists, then the agent will suspend and delay acting on this goal. When the deliberation detects that the available applicable plans conflict with the current intentions, the agent will, for instance, either (1) delay acting on the goal, when the goal has a lower priority than the conflicting intentions, or (2) begin the realization of the goal and at the same time suspend the conflicting intentions, when the goal has a higher priority. As the decision making carried out in the deliberation step is costly reasoning, ICAGENT adds another, less costly decision level (meta-level decision) above the deliberation level.
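The outcomes of the deliberation step described above can be sketched as a small decision function. This is an illustrative assumption rather than the ICAGENT implementation: `Decision`, `Goal` and `deliberate` are hypothetical names, the conflict test is passed in as a function, and, when every applicable plan conflicts, the sketch arbitrarily considers the intentions blocking the first applicable plan.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ADOPT = "adopt"
    DELAY = "delay"
    ADOPT_AND_SUSPEND = "adopt_and_suspend"


@dataclass(frozen=True)
class Goal:
    name: str
    priority: int


def deliberate(goal, applicable_plans, intentions, conflicts):
    """Return (decision, intentions_to_suspend) for a new goal.

    applicable_plans: plans that passed the applicability check.
    intentions: goals the agent is currently committed to.
    conflicts(plan, intention): True if the plan conflicts with the intention.
    """
    # Applicability checking: no applicable plan means the goal is delayed.
    if not applicable_plans:
        return Decision.DELAY, []
    # Conflict checking: a conflict-free applicable plan makes the goal feasible.
    for plan in applicable_plans:
        if not any(conflicts(plan, i) for i in intentions):
            return Decision.ADOPT, []
    # Every applicable plan conflicts: compare priorities (simplifying
    # assumption: only the first applicable plan is considered here).
    blocking = [i for i in intentions if conflicts(applicable_plans[0], i)]
    if all(goal.priority > i.priority for i in blocking):
        return Decision.ADOPT_AND_SUSPEND, blocking
    return Decision.DELAY, []
```

For instance, a high-priority `recharge` goal whose only applicable plan conflicts with a low-priority `patrol` intention is adopted while `patrol` is suspended, whereas a lower-priority goal in the same situation is delayed.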
If, at this meta-level decision, there are some indications ensuring that the new goal is feasible, in the sense cited above, then the agent can act on this goal reactively, i.e., without deliberation (reactivity-oriented behavior). However, if the meta-level decision does not have sufficient assurance about the feasibility of the goal, the agent will carry out more elaborate reasoning (deliberation) to confirm the feasibility of the goal, and thus decide whether to act upon it. In this latter case, the agent is behaving deliberatively (deliberation-oriented behavior). The originality of ICAGENT is that it naturally integrated the reactivity-oriented behavior in BDI agents, by considering that the difference between this new behavior and the default one lies in whether the agent acts after deliberating or acts directly without deliberation. In this context, we propose to qualify the reactivity linked to the reactivity-oriented behavior as “lower-level” reactivity, in opposition to the “higher-level” reactivity. In ICAGENT, the integration of this new behavior was also done when plans were elaborated. In fact, all plan portions (realizing sub-goals) included in a plan that was selected for execution in a deliberative manner are themselves selected for execution either in a deliberative or in a reactive manner,3 depending on the decision of the agent. In this case, the agent is said to “intermix” reactive and deliberative behaviors. In this paper, in order to differentiate this third use of the word “reactivity” from the two previously mentioned ones, we propose to use “plan portion-reactivity”. In the same manner, we use “plan portion-deliberation” to qualify the deliberative behavior concerning a portion or a sub-goal of a plan.
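The choice between reactivity-oriented and deliberation-oriented behavior can be sketched as follows. The function `meta_level_step` and its three callable parameters are hypothetical: the sketch only assumes that some cheap test can provide assurance about feasibility, and it does not reproduce how ICAGENT obtains that assurance.

```python
def meta_level_step(goal, has_cheap_assurance, deliberate, act):
    """Decide at the meta level whether to act reactively or to deliberate first.

    has_cheap_assurance(goal): inexpensive test giving assurance of feasibility.
    deliberate(goal): costly reasoning, returns True if the goal is feasible.
    act(goal): acts on the goal and returns the result.
    Returns (mode, result), where result is None if the goal is not acted on.
    """
    if has_cheap_assurance(goal):
        # Reactivity-oriented behavior: act directly, skipping deliberation.
        return "reactive", act(goal)
    # Deliberation-oriented behavior: confirm feasibility before acting.
    if deliberate(goal):
        return "deliberative", act(goal)
    return "deliberative", None
```

The design point illustrated here is that the costly `deliberate` call is only reached when the cheap meta-level test is inconclusive.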
The definition of flexibility that is most frequently used in the agency literature, in the case of single agents, takes into account the necessity of combining both reactivity and pro-activeness behaviors:4 “Flexibility is the capability to handle possibly unexpected events and simultaneously to act with planning and goal orientation” [95]. The trade-off between reactivity and pro-activeness behaviors is a very important aspect that should be taken into account by an agent, especially in dynamic and unpredictable environments. By taking into account the three previously cited uses of the word “reactivity”, in the context of BDI agents, we split this trade-off into three separate parts. The first part concerns the trade-off between lower-level reactive and deliberative behaviors. The second part deals with the trade-off between higher-level reactive behavior (i.e., thinking about goals to be realized and thinking about plans to be executed) and acting task. All the current BDI architectures, except ICAGENT, take only into account this second part, i.e., the two other parts are not considered. The final part is about the trade-off between “plan portion-reactivity” and “plan portion-deliberation”. This balance specifies, for a given situation, if a particular plan portion should be executed in a reactive or deliberative manner.
By taking into account these three cited trade-offs, we propose a refinement of the previous flexibility definition as follows: An agent’s flexibility is the agent’s ability to balance on one hand its lower-level reactive and deliberative behaviors and on the other hand the two deliberative behavior’s sub-tasks: higher-level reactive and acting tasks. Besides, in the case the agent behaves in a deliberative manner, the flexibility in behavior requires that the agent should for each selected plan portion balance between plan portion-reactivity and plan portion-deliberation.
The balance between lower-level reactive and deliberative behaviors and the balance between plan portion-reactivity and plan portion-deliberation is generally achieved by distinguishing the situations where the agent should behave reactively or deliberatively. The balance between higher-level reactive and acting tasks is concretized in practice by the strategy used for the intention reconsideration (i.e., when should the agent reconsider its goals?) and the strategy used for plans revision (when should the agent revise a given executed plan?). The strategies realizing the previously mentioned trade-offs are related to the fourth flexibility requirement proposed in [53] and cited at the beginning of this section.
In order to have flexible and appropriate intention reconsideration and plans revision strategies, the agent should, among other things, be well informed about its environment and internal state. The same remark applies to the other cited tasks of the BDI agent’s architecture; for these tasks to behave in a flexible and appropriate way, they should be well informed about the agent’s environment and internal state. In this paper, we propose to use the word “well-informed” to qualify the reasoning and decisions of the agent, meaning that the agent carries out more informed decisions and reasoning. To attain this, the agent should base its reasoning upon the relevant information and attributes concerning its environment, goals, plans, etc. For instance, the well-informedness of the “deliberation” task requires at least two relevant pieces of information: the “conflict relation” between goals and the “applicability” of a plan. This well-informedness property is related to the “consistency of intentions” and “feasibility” properties of the BDI model. In addition, it is related to the second flexibility requirement cited at the beginning of this section.
A last requirement for flexibility is related to real-time performance. The agent should respond quickly to circumstance changes and all of its goals should be carried out and achieved at appropriate times. A possible configuration for the agent’s architecture to meet the real-time requirement is to use a “parallel” architecture. The first work that argued the advantage of this aspect for BDI agents was proposed by Zhang and Huang [103, 104]. The main idea behind the parallel BDI architecture is that it is better, from a real-time point of view, to execute the three BDI agent’s cycle steps or tasks (sensing, deliberating and acting) in parallel rather than in sequence. In fact, running the BDI agent’s cycle steps sequentially, means that the agent is not able to detect new events before the current iteration step is finished [49, 103, 104]. Consequently, the reactivity needed in highly dynamic environments is reduced [102] which will eventually diminish the flexibility of the agent.
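The idea of running the three steps in parallel rather than in sequence can be sketched with three threads connected by queues. This is only an illustration under simplifying assumptions and not the design of [103, 104]: the function names, the `STOP` sentinel, and the trivial mapping from an event to an intention are all hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel used to shut the pipeline down


def sense(percepts, events):
    """Sensing task: push each new percept as an event without waiting for acting."""
    for percept in percepts:
        events.put(percept)
    events.put(STOP)


def deliberate(events, intentions):
    """Deliberating task: turn events into intentions as soon as they arrive."""
    while True:
        event = events.get()
        if event is STOP:
            intentions.put(STOP)
            return
        intentions.put("handle_" + event)  # toy deliberation (assumption)


def act(intentions, log):
    """Acting task: execute intentions concurrently with sensing and deliberating."""
    while True:
        intention = intentions.get()
        if intention is STOP:
            return
        log.append(intention)


def run_parallel_cycle(percepts):
    events, intentions, log = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=sense, args=(percepts, events)),
        threading.Thread(target=deliberate, args=(events, intentions)),
        threading.Thread(target=act, args=(intentions, log)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log
```

Because the sensing thread never waits for the acting thread, a new event can be detected while an earlier intention is still being executed, which is the property that a sequential cycle lacks.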
The following is a summary of the minimal set of flexibility requirements discussed above for the structure of a BDI architecture. Firstly, the architecture of the BDI agent should include at least the following tasks: (a) tasks for updating beliefs and generating goals, (b) one task (perceptions filter) for focusing the perception and another (relevant changes’ monitor) to filter changes and inform the agent about the relevant ones, (c) a task for realizing the reactivity-oriented behavior, (d) some tasks for realizing the deliberation-oriented behavior, i.e., deliberation, acting (scheduling and executing intentions), and plans revision, and (e) tasks for concretizing the trade-off between lower-level reactive and deliberative behaviors, the trade-off between higher-level reactive and acting tasks (thinking strategies: intention reconsideration and plans revision), and the trade-off between plan portion-reactivity and plan portion-deliberation. Secondly, all of the agent’s tasks should be well informed about the agent’s environment and internal state. Finally, it is recommended that the architecture be a parallel one to meet real-time requirements.
In the next section, we discuss and examine some representative BDI architectures in the light of some of the above criteria of flexibility.
Representative Belief-Desire-Intention (BDI) architectures in context of the proposed flexibility criteria
Most of the existing BDI architectures and platforms are based upon the well-known Procedural Reasoning System (PRS) [33, 43]. In fact, PRS is the 1
Some other BDI architectures have been proposed in the literature in order to extend and improve PRS. In this paper, we focus on some representative architectures (see Table 1) which are: Jadex [12, 13, 14, 73, 74], PRACTIONIST [65, 66, 67], the Stabile’s work on filtering perceptions [84, 85, 71], the parallel BDI architecture [103, 104], the Koch and Dignum’s architecture [49, 50, 51], the ICAGENT [53], the Saadi et al’s architecture [80], the work of Vikhorev [92, 93, 94], the GORITE platform [44, 45, 79] and the work of Zatelli [101].
The Jadex platform addresses the shortcomings of PRS systems by treating the goal as a first-class programming concept, more specifically by introducing a life cycle for goals [73]. Therefore, goals are represented and manipulated explicitly, which gives the agent the capability to reason about them. Jadex further improves the flexibility of PRS systems through three characteristics. The first characteristic is goal generation, permitted by associating a “creation” condition with a goal. The second characteristic is the parallel execution of the architecture’s modules: the three steps of the PRS cycle are carried out independently by three distinct modules (message receiver, dispatcher and scheduler) which operate concurrently on the data structures of the agent [12]. The third characteristic concerns the inclusion of the deliberation step. The deliberation in Jadex (the so-called “easy deliberation strategy”) is a matter of managing the state transitions of goals, i.e., deciding which goals are in the “Active” state (intention) and which are just in the “Option” state (desire) [13]. The deliberation is based on two pieces of information [74]: (1) the cardinality, which restricts the number of active goals of a given type, and (2) the inhibition arc, which expresses a local conflict and precedence between two types of goals. Although this deliberation strategy is a very interesting approach for the agent’s deliberation, it is not sufficiently “well informed”, which negatively impacts the agent’s flexibility. In fact, the deliberation in Jadex considers only the conflict relation between goals and does not take into account whether the goal that will be activated (intention) has an applicable plan or not. Not considering the feasibility of a goal, more specifically the existence of an applicable and non-conflicting plan, before its activation may lead to situations where the agent has adopted an intention which has no applicable plan.
So, the achievement of the intention will be delayed and suspended until an applicable plan is found. In addition to the previous problem, the deliberation in Jadex does not consider other important factors such as the goal’s urgency that influences the decision about which goals should have the agent’s attention first [103, 104]. Other issues of Jadex can be found in Table 1.
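The two pieces of information used by this deliberation strategy can be illustrated with a minimal Python sketch. It is not Jadex code and does not use the Jadex API: the names `Goal`, `deliberate`, `cardinality` and `inhibits` are hypothetical, goals are examined in list order by assumption, and the deactivation of already active goals is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    gtype: str
    state: str = "option"   # "option" (desire) or "active" (intention)


def deliberate(goals, cardinality, inhibits):
    """Activate options unless a cardinality limit or an inhibition arc forbids it.

    cardinality: dict mapping a goal type to its maximum number of active goals
    inhibits:    set of (inhibitor_type, inhibited_type) pairs (assumed encoding)
    """
    for goal in goals:
        if goal.state != "option":
            continue
        active = [g for g in goals if g.state == "active"]
        # (1) Cardinality: limit the number of active goals of this type.
        limit = cardinality.get(goal.gtype)
        same_type = sum(1 for g in active if g.gtype == goal.gtype)
        if limit is not None and same_type >= limit:
            continue
        # (2) Inhibition arc: an active goal of an inhibiting type blocks this one.
        if any((g.gtype, goal.gtype) in inhibits for g in active):
            continue
        goal.state = "active"
    return [g.name for g in goals if g.state == "active"]


goals = [
    Goal("recharge", "maintain_battery"),
    Goal("clean_room_1", "clean"),
    Goal("clean_room_2", "clean"),
]
active = deliberate(
    goals,
    cardinality={"clean": 1},
    inhibits={("maintain_battery", "clean")},
)
print(active)
```

Nothing in `deliberate` asks whether an activated goal has an applicable plan, which is exactly the limitation discussed above.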
Another platform that tried to improve PRS systems is the PRACTIONIST platform [66]. The PRACTIONIST agents have two important characteristics. The first characteristic concerns the concept of goal. This latter, viewed in this platform as an abstraction to formally define both desires and intentions, is represented and manipulated explicitly by the agent. The second important characteristic is that the deliberation and reasoning about goals is based upon the so called goal model that specifies the structure of goals and the relations among them. Although PRACTIONIST agents are flexible essentially due to their manipulation of an explicit goal representation and a rich taxonomy of goals relations, their reasoning is not sufficiently “well informed”; these agents do not consider important information relevant to deliberation, such as the goal’s urgency. In addition, as in Jadex, the deliberation task does not consider the existence of an applicable plan before the activation of a goal. Another aspect of a PRACTIONIST agent that affects its flexibility is the absence of goal generation. A further issue is that PRACTIONIST agents are not parallel ones. Other issues can be found in Table 1.
Other works in the literature aiming to improve PRS are Stabile’s work on filtering perceptions [84], the so-called “parallel BDI” architecture [104] and Koch and Dignum’s BDI architecture [50]. These three architectures were mainly proposed in order to improve the reactivity of PRS systems. In fact, the loop structure and the sequential execution of the steps of a PRS agent’s cycle introduce a delay between the moment when the observation is made and when it is processed. This unfortunately decreases the level of reactiveness required to support highly dynamic environments [49, 104]. The work on filtering perceptions [84] extended the PRS agent’s cycle with a step to focus the agent’s perception (i.e., filtering the agent’s perceptions). Although there are some earlier PRS extensions with the ability to filter perceptions (see for instance [60]), the work of Stabile is an interesting one, as it is the first that concretized this ability in BDI, via a special function or module, the “perception filter”. The idea of perception filters was first proposed in [8], outside the BDI context. In the work of Stabile, the newly added step aims to reduce the number of processed perceptions, and thus the global time needed for the agent’s cycle. Although it has been shown that this improved PRS agent’s cycle leads to better reactivity than the classical cycle (see [84] for more details about the experiments done in the field of embedded robotic agents, using the Jason platform), the main problem impacting the reactivity (i.e., the delay between the moment of observation and the moment of processing) still exists. The second work aiming at improving the reactivity of PRS is the “parallel BDI” architecture [104]. This architecture improves the reactivity of PRS agents by parallelizing the execution of the various tasks. Recent work by Zatelli [102] concretized this idea of parallel BDI agents in the Jason BDI platform.
The architecture of Zhang and Huang is a very interesting one for the improvement of the BDI agent’s reactivity and hence its flexibility. However, some issues related to well-informedness can be noted, especially concerning the plan selection step. As in PRS systems, plans are adopted for execution without considering possible conflicts with the plans in execution. Another drawback is that, although the agent’s reasoning manipulates urgency, the semantics proposed for urgency is not clear, as it is not related to the notion of deadline. Other characteristics and limitations are indicated in Table 1. The third BDI architecture aiming at improving the reactivity of PRS was proposed by Koch [49, 50]. Koch adopted the “parallel execution” principle and introduced a module (context observer) specialized in observing the environment. This module works by observing changes in the environment, rationalizing the changes that are relevant to the current processing and sending control events to the planning threads. In this architecture, goals are associated with conditions; in the case of desires these conditions are “creation” conditions, whereas for intentions they are “termination” conditions. In Koch and Dignum’s architecture, a relevant event is an update of the belief base related to a sub-condition included in a goal’s condition. The context observer bases its decision about the relevance of a given belief change on an “impact” function. The impact function, based on the “window of opportunity” concept, helps the context observer to detect belief changes that may influence the truth value of conditions associated with goals (see [49, 50] for more details). A relevant event generated by the context observer leads the deliberation module to verify the truth value of the associated goal’s condition. The verification of this condition can lead to starting, pausing, resuming or terminating the goal processing.
The Koch and Dignum’s architecture is a very interesting approach for designing BDI agents with further flexibility. We believe that its most important contribution to the BDI agent literature is the proposition of an approach and a structure that implements efficiently the “Relevant changes’ monitor” task. However, this architecture presents some limitations regarding its reasoning not being sufficiently well-informed. First, events for which the agent revises its current processing concern only the condition of the goal and do not consider other relevant information related to the plan’s applicability. A plan in execution that turns out to be inapplicable in the current situation should lead the deliberation module to pause the processing of the associated intention, if there is no other alternative applicable plan. The second limitation is that the deliberation task, as in Jadex and PRACTIONIST, does not consider the existence of an applicable plan before the activation of a goal. The third limitation is that although the deliberation module manipulates priority relation between goals, it would be preferable to refine priority in terms of more concrete information such as importance and urgency.
To our best knowledge, the first BDI architecture that was proposed explicitly to treat the flexibility issue is ICAGENT [53, 54]. This architecture is not based on the PRS system but on the collaborative planning framework [36] which is extended with advanced plan management tasks [75]. ICAGENT has two main characteristics. The first characteristic is that ICAGENT uses an explicit representation of mental attitudes and features which are included in a knowledge base (KB). As an example of these features we mention: facts, the agent’s mental attitudes (beliefs, desires and intentions), the context of action (plans in execution, constraints and conditions associated with these plans, etc.), recipes (plans that the agent is able to execute), and situation rules (used to generate desires). The second main characteristic is that the ICAGENT architecture contains the following modules: Perception module, opportunities and situation recognition (goals generation), reconciliation (deliberation), plan elaboration and intention realization. In addition, KB is updated and consulted via a special module which ensures KB’s consistency as well as the parallel and coordinated run of the modules.
The key idea, upon which ICAGENT is based, is that the distinction between deliberation oriented behavior and reaction oriented behavior does not rely on whether an agent plans or invokes a hard-wired procedure, but on how plans (or plan portions) are manipulated (reactively or deliberatively) [53]. ICAGENT agents use the “behavioral mode” attribute of the situation rule and the plan, namely the “BMntlcond” condition, to decide about the way the plan (or plan portion) shall be utilized. Thus for each plan that the agent selects from the KB, it has to decide whether it will deliberate about it before starting its elaboration/execution or start its realization directly. ICAGENT agents are flexible essentially due to five characteristics. First, the filtering of perceptions and the environment monitoring are taken into account by the perception module. Second, goals generation is permitted by the use of situation rules. The parallel running of the architecture’s modules is the important third characteristic. The fourth characteristic is the intermixing between deliberation and reaction, by distinguishing, via the “behavioral mode” attribute “BMntlcond”, plans/plan portions that are generated in a reactive or deliberative way. The fifth characteristic concerns the fact that deliberation of a given plan (and thus the associated goal) is based upon the so called “list of check directives”. These directives specify the agent’s features, including the plan’s effect, plan’s precondition and the agent’s mental state, that must be checked by the deliberation module for the occurrence of possible conflicts. This list of features is not fixed; it is rather determined at run time for each selected plan, specifically during the test of the “BMntlcond” condition. In addition, before adopting a goal for execution, the agent checks for its feasibility, i.e., it checks for the existence of an applicable and non-conflicting plan.
ICAGENT is a very interesting approach to design BDI agents; however, it does suffer some drawbacks. First, although the two capabilities of “filtering perceptions” and “environment monitoring” are taken into account by the perception module, no details were given on how they should be realized. Second, the reasoning of ICAGENT is not sufficiently “well informed”, since it does not consider important features relevant to the agent’s reasoning such as the goal’s urgency.
The second work found in the BDI literature that was explicitly proposed to treat the flexibility issue is Saadi et al.’s architecture [80]. According to this work, the first step towards more flexible reasoning in BDI architectures is to consider the different attributes useful for the decision-making process. This architecture is based upon a rich set of attributes that fundamentally concern the characteristics of the goal, such as urgency and importance. These attributes are inspired essentially by two independent works [7, 69]. Besides, the agent’s reasoning takes into account the fact that a given issue or problem can be resolved by some alternative goals. So, in Saadi’s architecture, the ability to reason about alternative solutions is exploited not only at the plan level but also at the goal level. The information concerning alternative goals is relevant and important for the agent to reason in an appropriate manner. Furthermore, before adopting a goal for execution, this architecture checks for its feasibility, i.e., it verifies the existence of an applicable and non-conflicting plan. The idea behind this architecture is interesting for the design of BDI agents. However, two main limitations should be noted. First, the process of goal generation was neither detailed nor specified [80]. Second, this architecture includes a set of agent’s motives. It is well accepted that motives are important components of the agent’s internal state and that they are relevant to the agent’s reasoning and goal generation. In addition, it is argued that flexibility in reasoning can be achieved through the use of motivation [47, 61]. However, the concept of motive does not originally exist inside the BDI model, and the description of Saadi et al.’s BDI architecture (in [80]) does not clarify how motives can be incorporated naturally inside BDI agents.
The last issue is related to the agent’s motivation, an important concept for the well-informedness of the agent. Further features and limitations of this work can be found in Table 1.
In the BDI agents’ literature, some other architectures were proposed in order to improve the well-informedness of PRS. We mention for instance the work of Vikhorev [92, 93, 94], in which the BDI agent’s reasoning is based essentially upon the goal’s deadline and importance attributes. Note that these attributes are also manipulated by Saadi’s architecture but exploited by different steps of the reasoning. More precisely, in the work of Vikhorev, these attributes are exploited by the execution step of the agent’s cycle, more specifically by intention scheduling. In the work of Saadi, however, these attributes are not exploited for intention scheduling but are rather manipulated by the deliberation task. The work of Vikhorev on the one hand and the work of Saadi on the other are complementary, in the sense that they both try to concretize two different steps in the BDI agent’s reasoning, using the same kind of attributes. Another characteristic of the architecture of Vikhorev is that goals are represented explicitly (included in a goal stack). However, this explicit representation is taken into account only by the plan selection step and is not exploited to provide more elaborate goal processing such as deliberation. Other characteristics and issues can be found in Table 1.
Finally, two recent works were proposed in order to improve the well-informedness of PRS. The first one is the BDI framework: GORITE [44, 45, 79], which proposed a unified framework for managing both individual and team goals. Although the cooperative and social behavior is outside the scope of the present paper, an interesting relevant characteristic worth mentioning here is that GORITE’s authors argue that in order to support intention management and manipulation, a BDI platform should explicitly represent and manipulate intentions. In order to satisfy this requirement, authors of GORITE argued that an alternative BDI execution model is needed where agents delegate the management of their behaviors to an executor object; the executor is responsible for initiating goal executions on behalf of the agent. In this model, agents are still able to choose between courses of action to achieve a goal or to reconsider how a goal might be achieved. The executor object runs in separate threads which favors the parallel execution. Although GORITE represents a goal explicitly, goal reasoning and processing was focused only on the intentions scheduling and execution, which is the final step of the BDI agent reasoning. The deliberation task is not present, which leads the agent to adopt a goal for execution without checking if it is conflicting with the actual intentions. Another well-informedness issue is to do with the intention scheduling realized by the executor; although it manipulates priority relation between intentions, it would be preferable to refine priority in terms of more concrete information such as importance and urgency. The second recent work, which is an improved version of PRS, was proposed by Zatelli et al. [101]. This architecture mainly extended the plan’s applicability checking step of plan selection with another step for checking its executability. In this case, a plan is said to be executable if it is non-conflicting with intentions in execution. 
The main information exploited by this executability check (conflict checking) is the conflict relation between plans. The advantage of adding the executability checking step is to enforce the feasibility checking of the selected intention. An intention can immediately start its execution instead of getting suspended due to the lack of capability of an agent to adopt an alternative plan when conflicts are detected [101]. Apart from the latter particular advantage of this extended PRS, Zatelli’s architecture inherited the remaining limitations of the original PRS (see Table 1 for more details).
Table 1. Comparison of some representative BDI architectures according to some flexibility criteria. A cross indicates satisfaction of the specific criterion.
Following the above detailed review of some relevant BDI architectures, we suggest some key overlooked flexibility requirements that would be interesting to consider in actual BDI architectures in order to further improve their behavioral flexibility.
Beginning with the well-informedness requirement, we discuss five key relevant points, of which the first two have led to a divergence among researchers. The first point deals with how the agent’s goals are represented. Although the necessity of explicit representation and manipulation of goals is recognized by several works, there is no consensus about how goals should be represented. However, there is an interesting emerging trend that views a goal as an object having a life cycle. This trend is supported by extensive research, including formal and logical languages [38, 39, 40, 56, 72, 86, 87, 90], the philosophical and theoretical work on intention proposed by Castelfranchi [17, 19, 20, 21, 22], some recent applications in the field of robotics and autonomous vehicles [1, 26, 27, 48, 77], and finally a small number of BDI architectures (e.g., the previously reviewed Jadex, Parallel BDI and Saadi’s architectures). In this research stream, the notions of ‘desire’ and ‘intention’ do not indicate two separate cognitive primitives, but rather goals at different stages of processing. We argue, as Castelfranchi suggested in his work [22], that viewing goals as objects with life cycles not only simplifies but also contributes to a better understanding of goal management and processing. Besides, defining intermediate attitudes (towards a goal) between “desire” and “intention” allows the agent’s deliberation-oriented behavior, concerning each goal, to switch between many agent attitudes. We believe that this last property gives increased flexibility in behavior in comparison with agents having no intermediate attitudes between desires and intentions. So, it would be interesting to adopt this point of view to represent goals in BDI architectures.
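To make the “goal as an object with a life cycle” view concrete, the following Python sketch models a goal as a small state machine. The state names (in particular the intermediate `SELECTED` attitude) and the transition table are illustrative assumptions and are not taken from any of the cited models.

```python
from enum import Enum, auto


class GoalState(Enum):
    DESIRE = auto()
    SELECTED = auto()    # hypothetical intermediate attitude between desire and intention
    INTENTION = auto()
    SUSPENDED = auto()
    ACHIEVED = auto()
    DROPPED = auto()


# Allowed transitions (an assumption made for this example only).
TRANSITIONS = {
    GoalState.DESIRE: {GoalState.SELECTED, GoalState.DROPPED},
    GoalState.SELECTED: {GoalState.INTENTION, GoalState.DESIRE, GoalState.DROPPED},
    GoalState.INTENTION: {GoalState.SUSPENDED, GoalState.ACHIEVED, GoalState.DROPPED},
    GoalState.SUSPENDED: {GoalState.INTENTION, GoalState.DROPPED},
    GoalState.ACHIEVED: set(),
    GoalState.DROPPED: set(),
}


class Goal:
    """A goal that starts as a desire and moves through its life cycle."""

    def __init__(self, name):
        self.name = name
        self.state = GoalState.DESIRE
        self.history = [self.state]

    def move_to(self, new_state):
        # Reject any transition that the life cycle does not allow.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} is not allowed")
        self.state = new_state
        self.history.append(new_state)


goal = Goal("deliver_parcel")
goal.move_to(GoalState.SELECTED)     # chosen by deliberation
goal.move_to(GoalState.INTENTION)    # committed to
goal.move_to(GoalState.SUSPENDED)    # e.g. no applicable plan for the moment
goal.move_to(GoalState.INTENTION)    # resumed
print([s.name for s in goal.history])
```

In this reading, “desire” and “intention” are two values of one attribute of the same object, so adding an intermediate attitude only requires a new state and its transitions.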
The second well-informedness related point on which there is no consensus, in the BDI literature, concerns the goal’s feasibility checking. The goal’s feasibility checking should be done before adopting a goal for execution, i.e., as intention [3, 4]. The checking of the goal’s feasibility includes two steps: (1) Checking for the existence of an applicable plan for that goal and (2) Checking that this applicable plan is non-conflicting and can be achieved in conjunction with the actual intentions. In the case of the existence of applicable plans for our goal (1
The first feasibility checking strategy, focused on the plan’s applicability, is applied by BDI architectures where there is no deliberation task (as in PRS system), whereas the second and third strategies are used by BDI architectures including the deliberation task. Besides, the second feasibility strategy is inherited from the classical scheme and concretization of Practical Reasoning (PR), where the means-end reasoning (searching for plans to achieve the selected goals) takes place after the deliberation step. However, some authors [3, 4] in the field of argumentation reasoning, where they tried to formalize PR, have indicated some issues related to this PR’s classical scheme. They noticed that following this scheme, the agent can choose an intention which is unfeasible, i.e., for which no applicable plan can be formed, and in doing so the deliberation task might exclude some options which could be carried out. So, in order to avoid this lack of feasibility checking, Amgoud and colleagues proposed another decomposition of PR, where the means-end reasoning step does not take place after the deliberation step, but rather before. This enhanced PR scheme enables the deliberation step to include a complete goal’s feasibility checking (by taking advantage of means-end reasoning), and so to take more informed decisions. The third feasibility strategy cited above conforms to this enhanced PR scheme. It is clear that it would be practical for the BDI agent reasoning to adopt this new PR scheme, as this would allow the improvement of the well-informedness of the deliberation task.
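The two-step feasibility check (an applicable plan exists, and it does not conflict with the current intentions) can be sketched as follows in Python. This is an illustrative example only: the `Plan` structure, the set-based preconditions, and the modelling of a conflict as a shared resource are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    goal: str
    precondition: frozenset   # beliefs that must hold for the plan to be applicable
    resources: frozenset      # resources held while the plan executes (assumed conflict model)


def applicable_plans(goal, library, beliefs):
    """Means-end reasoning step: plans for the goal whose precondition holds."""
    return [p for p in library if p.goal == goal and p.precondition <= beliefs]


def feasible_plan(goal, library, beliefs, running_plans):
    """Return an applicable plan that does not conflict with plans in execution, or None."""
    busy = set()
    for plan in running_plans:
        busy |= plan.resources
    for plan in applicable_plans(goal, library, beliefs):
        if not (plan.resources & busy):
            return plan
    return None


library = [
    Plan("drive", "reach_office", frozenset({"has_car"}), frozenset({"car"})),
    Plan("walk", "reach_office", frozenset(), frozenset({"legs"})),
    Plan("fly", "reach_moon", frozenset({"has_rocket"}), frozenset({"rocket"})),
]
beliefs = frozenset({"has_car"})
running = [Plan("lend_car", "help_friend", frozenset(), frozenset({"car"}))]

chosen = feasible_plan("reach_office", library, beliefs, running)
print(chosen.name if chosen else None)
print(feasible_plan("reach_moon", library, beliefs, running))
```

A deliberation step that calls `feasible_plan` before promoting a goal to an intention would reject `reach_moon` instead of adopting it and suspending it later.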
The third point related to well-informedness concerns the goal’s urgency attribute. Although this attribute is relevant to goal management and processing, only a very small number of BDI architectures include it in their reasoning. In the discussed works, only two architectures (Vikhorev’s and Saadi’s) manipulate this relevant information.
The fourth well-informedness related point is to do with information about goal alternatives. Among the discussed works, Saadi’s architecture is the only one that manipulates this relevant attribute (see Table 1). Accordingly, this particular architecture is interesting in relation to the well-informedness as it is unique in combining all of the four previously recommended points, namely goal representation as “object with life cycle”, the 3
The last well-informedness related point, not included by any of the actual BDI architectures, concerns motives. Motives are important and relevant for an agent and its reasoning. In general, it is argued that flexibility in reasoning can be achieved through the use of motivation [47, 61]. In fact, among the added values of including motives in the agent’s design, we mention the three following benefits (For more details about motives see for instance [25, 47, 64, 70]). The first added value is the support of goal generation: The motivation (intensity of motive) can be used to quantify relevant changes for which the agent should generate goals. The second added value is the support of deliberation. A motivation can be used as a heuristic in the selection of goals (deliberation). In fact, the motivation that the agent has towards achieving a goal is a measure of the priority of the goal for the agent at that particular time. The third added value is the support of intention scheduling. A motivation can serve to drive the execution of a plan to satisfy the motivated goal in the required conditions and at the right moment. Hence, motivation can be a heuristic tool in the execution prioritization of different plans.
The standard method to extend the BDI model with motives is to create a supplementary agent’s component for motives in addition to the existing ones, i.e., beliefs and goals (see [64] for an overview). However, the resulting agent’s model cannot technically be qualified as a BDI one, since a motive component does not originally exist in the BDI model. Given the evident relation and similarity between the two concepts of “motive” and “maintain goal”, it is surprising that there is no research on how the motive concept can be integrated naturally into BDI agents, except for the recent work in [81]. This work showed how to express the motive concept in terms of the goal concept, particularly in terms of “maintain” goals. This means that to include motives in BDI agents, no components are necessary beyond the existing ones (beliefs and goals) [81]. A “maintain” goal (also called a maintenance goal) is a particular goal category whose aim is to ensure that some world state holds and continues to hold [14, 28]. So, the maintain goal is associated with some condition that is to be maintained. Both the “maintain goal” and the “motive” monitor the internal/external state of the agent for relevant changes and may accordingly produce goals [81]. In the case of a motive, a relevant change is one that brings the motive’s intensity (motivation) above a given threshold, and for which a goal is generated to mitigate this intensity. In the case of a maintain goal, a relevant change is, for example, the violation of the maintained condition, for which a goal will be generated to recover this condition. The fact that the motive’s intensity is above a given threshold can thus be concretized by the truth value of the condition to be maintained. In this way, motives can be included naturally inside BDI agents by using the concept of goal.
It is worth mentioning that the previously discussed representation of a goal, as an object with a life cycle, simplifies the inclusion and the processing of the various goal categories inside BDI agents (see [14] for an overview). So, it is clear that adopting the “goal as object with life cycle” view favors and simplifies the natural inclusion of motives inside BDI agents.
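The correspondence between a motive and a maintain goal can be illustrated with a short Python sketch. It is only one possible reading of the idea: the class `MaintainGoal`, the helper `motive_as_maintain_goal`, and the `hunger` example with its intensity function and threshold are hypothetical.

```python
class MaintainGoal:
    """A goal whose condition should hold and continue to hold."""

    def __init__(self, name, condition, recovery_goal):
        self.name = name
        self.condition = condition            # callable: beliefs -> bool
        self.recovery_goal = recovery_goal    # goal generated when the condition is violated

    def check(self, beliefs):
        """Return the goal to generate, or None if the condition still holds."""
        return None if self.condition(beliefs) else self.recovery_goal


def motive_as_maintain_goal(name, intensity, threshold, mitigating_goal):
    """Express a motive as a maintain goal: maintain 'intensity <= threshold'."""
    return MaintainGoal(
        name,
        condition=lambda beliefs: intensity(beliefs) <= threshold,
        recovery_goal=mitigating_goal,
    )


# Hypothetical motive: its intensity grows as the agent's energy drops.
hunger = motive_as_maintain_goal(
    "hunger",
    intensity=lambda beliefs: 1.0 - beliefs["energy"],
    threshold=0.7,
    mitigating_goal="eat",
)

print(hunger.check({"energy": 0.9}))   # intensity below the threshold
print(hunger.check({"energy": 0.2}))   # intensity above the threshold
```

No motive-specific component is needed here: the motive lives in the goal set as an ordinary maintain goal whose maintained condition encodes the threshold test.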
Besides the well-informedness related points discussed above, four other key features, overlooked by the majority of the discussed BDI architectures, would be worth including in order to further enhance the agent’s behavioral flexibility. The first key feature is the balance between the lower-level reactive and deliberative behaviors, taken into account only by ICAGENT. The majority of the discussed architectures behave according to only one of the two behavioral modes: lower-level reactivity mode (there is no deliberation) or deliberative behavior mode; these two modes do not coexist in these architectures. The second key feature concerns the balance between “plan-portion” reactivity and “plan-portion” deliberation, which is only taken into account by ICAGENT. The third key feature is related to the “Relevant changes’ monitor” task, which was realized, and realized efficiently, only in Koch and Dignum’s architecture, using the impact function and the concept of window of opportunity. The fourth key feature concerns the “perceptions filter”, which was included only in the BDI architecture presented in the work of Stabile [84].
In addition, the other flexibility criteria, such as the goal generation and the parallel architecture, are considered only in 6 architectures among the 11 discussed ones (see Table 1). So, it would be advantageous to include these characteristics in BDI architectures.
In order to gain insight into how to obtain a BDI architecture that includes all the above-discussed key points, we can draw inspiration from the works reviewed in this paper. In addition, we can include motives naturally in the BDI architecture by using the idea proposed in [81], i.e., by expressing motives in terms of “maintain goals”. So, concretely, motives can be included in the BDI agent’s goal set as maintain goals (see Fig. 2). Below, we briefly recall the role of the different characteristics and additional capabilities that would be worth including in actual BDI architectures. This will help us to put them in the right position inside the architecture and to have an idea about their interactions.
First, as it was discussed previously, it is recommended that the BDI architecture be a parallel one. Therefore, the different modules of the architecture run in parallel and can interrupt each other and communicate via events.
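A minimal Python sketch of modules that run concurrently and communicate via events is given below. It is an illustration under simplifying assumptions and not the architecture of Fig. 2: the three functions, the `STOP` sentinel and the string-based events are hypothetical, interruption between modules is not modelled, and Python threads provide concurrency rather than guaranteed parallel execution.

```python
import queue
import threading

STOP = object()   # sentinel event used to shut the pipeline down


def sensing(percepts, to_deliberation):
    """Push every percept as an event, without waiting for the other modules."""
    for percept in percepts:
        to_deliberation.put(percept)
    to_deliberation.put(STOP)


def deliberating(to_deliberation, to_acting):
    """Turn relevant events into intentions and forward them as events."""
    while True:
        event = to_deliberation.get()
        if event is STOP:
            to_acting.put(STOP)
            break
        if event.startswith("obstacle"):
            to_acting.put("avoid:" + event)


def acting(to_acting, log):
    """Execute intentions as they arrive (here, simply record them)."""
    while True:
        intention = to_acting.get()
        if intention is STOP:
            break
        log.append(intention)


def run_agent(percepts):
    events, intentions, log = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=sensing, args=(percepts, events)),
        threading.Thread(target=deliberating, args=(events, intentions)),
        threading.Thread(target=acting, args=(intentions, log)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log


result = run_agent(["obstacle_left", "noise", "obstacle_right"])
print(result)
```

Because sensing never blocks on deliberation or acting, a new percept is queued as soon as it is observed rather than at the start of the next cycle iteration.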
Second, the mission of the Perceptions Filter (PF) module is to focus the agent’s perception. Its effect is to reduce the amount of perceptions to be processed by the Beliefs Revision and Update (BRU) module, and so to improve its performance (see Fig. 2). The evident use of the PF module in the BDI architecture is to inject a filtered perceptions list into the BRU module. The PF approach proposed in [84, 85] is very interesting and worth drawing inspiration from.
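As a rough illustration of how a filtered perceptions list could be injected into the BRU module, consider the following Python sketch. It is not the filter mechanism of [84, 85]: the key-based relevance test and the function names `perception_filter` and `revise_beliefs` are assumptions made for the example.

```python
def perception_filter(percepts, relevant_keys):
    """PF: keep only the percepts whose key the agent currently cares about."""
    return [(key, value) for key, value in percepts if key in relevant_keys]


def revise_beliefs(beliefs, filtered_percepts):
    """BRU: update a copy of the belief base with the filtered percepts only."""
    updated = dict(beliefs)
    updated.update(filtered_percepts)
    return updated


percepts = [
    ("battery", 0.4),
    ("wall_color", "white"),
    ("obstacle_ahead", True),
    ("ambient_noise", 31),
]
filtered = perception_filter(percepts, relevant_keys={"battery", "obstacle_ahead"})
beliefs = revise_beliefs({"battery": 0.9}, filtered)
print(filtered)
print(beliefs)
```

Here the BRU step processes two percepts instead of four, which is the performance effect the PF module is meant to provide.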
Third, the Environment Monitoring (EM) module (or “relevant changes’ monitor”) is responsible for monitoring belief changes and for informing the other modules and components that concretize the agent’s practical reasoning about the relevant ones. For a given module, a belief change is relevant if, for instance, it may affect the truth value of some condition that influences the decisions taken by this module. Examples of such influential conditions, for the decisions taken by deliberation, are the various conditions attached to goals and plans (see [49, 50] for more details). Another example concerns motives: the behavior of a motive is governed by a condition on its intensity. If the intensity is above some threshold, the motive causes the generation of a goal in order to mitigate the intensity. The added value of the EM module for the BDI architecture is that a monitored module verifies the truth value of its influential conditions only for relevant belief changes, and not for every belief change. Indeed, verifying conditions for every belief change is a costly operation (for more details see the work of Koch [49, 50]). Given its role, the natural position of the EM module in the architecture is with the modules that interface the environment with the agent, i.e., with the PF and BRU modules (see Fig. 2). The EM approach presented by Koch [50] is efficient and worth drawing on, especially its method for estimating the impact of a given belief change on the truth value of a given influential condition (a method based on the concept of “windows of opportunity”).
The last point concerns the relation that should exist between the PF and EM modules. As the EM module monitors the belief changes that are relevant to the agent’s practical reasoning, and as the agent’s beliefs are directly influenced by the agent’s perceptions, it would be useful (in order to reduce the number of belief changes processed by the EM module) for the PF module to focus only on perceptions related to the relevant belief changes monitored by EM.
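The relevance idea described above can be illustrated with a minimal sketch. It is not Koch's algorithm and does not implement the “windows of opportunity” method; it only assumes that each influential condition declares the beliefs it reads, so that a condition is re-evaluated only when one of those beliefs changes. The class `EnvironmentMonitor`, its methods, and the `battery` and `weather` beliefs are hypothetical names introduced for illustration.

```python
from collections import defaultdict

class EnvironmentMonitor:
    """Hypothetical EM module: re-checks a condition only when a belief it reads changes."""

    def __init__(self):
        # belief name -> list of (subscriber, condition, callback)
        self._watchers = defaultdict(list)
        self.checks = 0  # number of condition evaluations, kept for illustration

    def watch(self, subscriber, belief_names, condition, on_true):
        """Register an influential condition together with the beliefs it depends on."""
        for name in belief_names:
            self._watchers[name].append((subscriber, condition, on_true))

    def relevant_beliefs(self):
        """Beliefs that matter to some module; a PF module could filter perceptions on these."""
        return set(self._watchers)

    def belief_changed(self, name, beliefs):
        """Called after a belief update; conditions unrelated to `name` are never evaluated."""
        for subscriber, condition, on_true in self._watchers.get(name, []):
            self.checks += 1
            if condition(beliefs):
                on_true(subscriber, name)

events = []
beliefs = {"battery": 0.9, "weather": "sunny"}
em = EnvironmentMonitor()
em.watch("recharge_motive", ["battery"],
         lambda b: b["battery"] < 0.2,
         lambda who, belief: events.append((who, belief)))

beliefs["weather"] = "rainy"
em.belief_changed("weather", beliefs)  # irrelevant change: no condition is evaluated
beliefs["battery"] = 0.1
em.belief_changed("battery", beliefs)  # relevant change: the motive is informed
```

In this sketch, `relevant_beliefs()` is the hook through which a PF module could restrict itself to perceptions that bear on monitored beliefs, which is the PF/EM coupling suggested in the text.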
Fourth, the agent’s motives are among the important components of the architecture for which relevant belief changes should be monitored. If the EM module notices a belief change that is relevant to a given motive, it immediately informs the motive, for instance via an event (see Fig. 2). In response, the motive verifies the condition on its intensity, i.e., whether the intensity is above a given threshold. If so, the motive causes the generation of a goal aiming to mitigate the intensity. This can be done, for instance, by asking the module responsible for goal generation, via an event, to generate a given goal (see the module GG in Fig. 2). The generated goal, which is in general either an achievement or a perform goal,11 is placed in the agent’s goal set. At this point, we should concretize the trade-off between lower-level reactive and deliberative behaviors, i.e., decide whether the newly generated goal should be processed in a reactive or a deliberative manner (see Section 3.2 for more details). It is therefore useful to include a module that takes this decision (for instance, the “Meta-Level Decision” (MD) module in Fig. 2). For the method used to realize this module, we can gain insight from the ICAGENT work [53]. If the MD module decides that the new goal should be acted on deliberatively, it asks the “Deliberation” module to deliberate about the goal, for instance via an event (see the event “deliberative acting” in Fig. 2). If the goal successfully passes the deliberation step, the agent should commit to realizing it, i.e., move it to the “intention” level. Given the advantages of representing a goal as an “object with life cycle” (for more details, see the opening paragraph of this section), we recommend using this representation in the BDI architecture.
With this goal representation, committing to a given goal simply means moving it from an initial state to another state (the “active” state, for instance) in which the agent intends to realize it. To concretize this “goals as object with life cycle” point of view, we can add to the architecture a module responsible for managing and changing goals’ states in response to relevant events transmitted from the other modules. For instance, when the deliberation module decides that the agent should realize a goal, it asks the Goal State Manager (GSM) module to move the goal to the state corresponding to the “intention” level (the “active” state, for instance). Modules similar to the GSM were used in two of the reviewed architectures: Jadex [12] and Saadi et al.’s architecture [80].
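The chain described above (motive, goal generation, deliberation, GSM) can be sketched as follows. This is a simplified illustration under stated assumptions, not the design of Jadex or of Saadi et al.: only the deliberative path is modelled, the three goal states and all class and function names are hypothetical, and the deliberation step is reduced to a callable that accepts or rejects a goal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional

class GoalState(Enum):
    NEW = "new"
    ACTIVE = "active"        # corresponds to the "intention" level
    SUSPENDED = "suspended"

@dataclass
class Goal:
    name: str
    state: GoalState = GoalState.NEW

@dataclass
class Motive:
    name: str
    threshold: float
    mitigating_goal: str
    intensity: float = 0.0

    def check(self) -> Optional[str]:
        """Invoked when the EM reports a relevant belief change."""
        return self.mitigating_goal if self.intensity > self.threshold else None

class GoalStateManager:
    """Hypothetical GSM: the single place where goal states are changed."""

    def __init__(self) -> None:
        self.goals: List[Goal] = []

    def add(self, name: str) -> Goal:
        goal = Goal(name)
        self.goals.append(goal)
        return goal

    def move(self, goal: Goal, state: GoalState) -> None:
        goal.state = state

def handle_relevant_change(motive: Motive, gsm: GoalStateManager,
                           deliberation_accepts: Callable[[Goal], bool]) -> Optional[Goal]:
    requested = motive.check()
    if requested is None:
        return None                      # intensity not above the threshold
    goal = gsm.add(requested)            # goal generation (GG)
    if deliberation_accepts(goal):       # deliberative path, assumed chosen by MD
        gsm.move(goal, GoalState.ACTIVE)
    return goal

gsm = GoalStateManager()
thirst = Motive("thirst", threshold=0.7, mitigating_goal="drink")
thirst.intensity = 0.5
first = handle_relevant_change(thirst, gsm, lambda g: True)   # below threshold: no goal
thirst.intensity = 0.9
second = handle_relevant_change(thirst, gsm, lambda g: True)  # goal generated and adopted
```

Routing every state change through `GoalStateManager.move` mirrors the role the text assigns to the GSM: other modules request transitions, and only the GSM performs them.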
Fifth, the deliberation module is another component of the architecture, in addition to motives, for which relevant belief changes should be monitored. If the EM module notices a belief change relevant to the deliberation module, it immediately informs it, for instance via an event (see Fig. 2). A first example of a relevant belief change is one that makes an executing plan non-applicable, i.e., its precondition is no longer satisfied. In this case, the deliberation module checks whether the associated intention is still feasible, i.e., whether it has another applicable and non-conflicting plan. The deliberation module distinguishes four situations. Situation 1 occurs if there is an alternative plan that is applicable and non-conflicting. In this case, the deliberation module asks the Acting module to revise the plan in question (see the event “Revise Plan” in Fig. 2). Situation 2 occurs when the executing intention has no other applicable plan; the deliberation module then decides to suspend the intention (until an applicable plan is available). This decision can be concretized, for instance, by requesting the Acting module to stop the execution of the non-applicable plan and, at the same time, asking the GSM to move the associated goal to the “suspended” state. Situation 3 occurs when the intention has applicable plans, but these plans conflict with some other executing intentions, and the intention has a higher priority than the conflicting intentions. In this circumstance, the deliberation module can decide to suspend the conflicting intentions. This decision can be concretized by requesting the Acting module to revise the plan currently in execution and, at the same time, asking the GSM to move the conflicting intentions to the “suspended” state. Situation 4 occurs when the intention has applicable plans that conflict with some other executing intentions, and the intention has a lower priority than the conflicting intentions. In this case, the deliberation module decides and acts as in the case where the intention has no applicable plan (i.e., as in situation 2).
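The four situations can be summarized as a small decision function. The sketch below is one possible reading of the paragraph above, under the assumption that the deliberation module answers with a list of requests addressed to the Acting and GSM modules; the data classes, the request names (`revise_plan`, `stop_plan`, `suspend`) and the integer priorities are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Intention:
    name: str
    priority: int

@dataclass
class Plan:
    name: str
    applicable: bool
    conflicts_with: Tuple[str, ...] = ()  # names of intentions the plan conflicts with

def on_plan_not_applicable(intention: Intention, alternatives: List[Plan],
                           executing: List[Intention]) -> List[Tuple[str, str, str]]:
    """Return (module, request, target) tuples for a plan that became non-applicable."""
    applicable = [p for p in alternatives if p.applicable]
    by_name = {i.name: i for i in executing}

    # Situation 1: an applicable alternative that conflicts with nothing in execution.
    for plan in applicable:
        if not any(c in by_name for c in plan.conflicts_with):
            return [("Acting", "revise_plan", plan.name)]

    # Situation 3: applicable but conflicting, and our intention has the higher priority.
    for plan in applicable:
        rivals = [by_name[c] for c in plan.conflicts_with if c in by_name]
        if all(intention.priority > r.priority for r in rivals):
            return ([("Acting", "revise_plan", plan.name)]
                    + [("GSM", "suspend", r.name) for r in rivals])

    # Situations 2 and 4: no usable plan, so the intention itself is suspended.
    return [("Acting", "stop_plan", intention.name),
            ("GSM", "suspend", intention.name)]

g = Intention("g", priority=5)
low = Intention("low", priority=1)
high = Intention("high", priority=9)

s1 = on_plan_not_applicable(g, [Plan("p1", True)], [low])
s2 = on_plan_not_applicable(g, [Plan("p1", False)], [low])
s3 = on_plan_not_applicable(g, [Plan("p1", True, ("low",))], [low])
s4 = on_plan_not_applicable(g, [Plan("p1", True, ("high",))], [high])
```

In this sketch the lower-priority case and the no-applicable-plan case share the same final branch, which reflects the statement that the deliberation module treats them identically.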
An abstract BDI architecture for further behavioral flexibility.
A second example of belief change, relevant for deliberation, is the one related to the fact that a given condition, associated with a current intention, is not satisfied. Among those conditions, we have the context condition (for more details about this condition see for instance [14, 49]). If the context condition of a goal is not satisfied, the deliberation decides to suspend the goal in question, until the truth value of the condition becomes true. This decision can be concretized, as in earlier cases, by asking the Acting and GSM modules to respectively stop the plan associated with the goal and to move this latter goal to the “suspended” state.
The two examples discussed above represent two situations in which the agent should revise its executing plans and intentions. In the architecture of Fig. 2, as in Koch’s work, the component that plays a central role in determining the moment of intention reconsideration and plan revision is the EM module. In other words, this module can be used to implement the balance between the higher-level reactivity and acting.
Sixth, the balance between “plan-portion” reactivity and “plan-portion” deliberation can be concretized, as in the ICAGENT architecture, in the same manner as the balance between the lower-level reactive and deliberative behaviors. In fact, a subgoal included in a plan is treated, by the MD module, as any other new goal.
Seventh, priority relations between goals are in general used by the BDI reasoning to resolve conflict situations between goals, and to schedule intentions (intentions scheduling). However, in order for the agent’s reasoning to be more informed, it is preferable, as it was done in the work of Vikhorev and the work of Saadi, to refine this priority relation in terms of more concrete information like urgency and importance. The evident relation between the intensity of a motive and the priority of a goal that was generated from it, was exploited by some works to propose methods to calculate the goal’s priority (see for instance [68]) and some others to propose a calculation of a refinement of priority such as urgency (see for instance [80]). So, it will be advantageous to gain insight from this relation to calculate the goal’s priority in general and its refinement in terms of urgency and importance.
Eighth, as in Saadi et al.’s architecture, the information concerning alternative goals is integrated in the proposed architecture by associating each of the agent’s motives with a set of alternative goals that can be generated from it, i.e., used to mitigate its intensity.
Finally, we conclude this discussion with an important question: does a BDI architecture really need an optimal coverage of all the previously cited flexibility requirements? Such coverage would evidently be ideal for further improving the behavioral flexibility of a BDI architecture. However, one could argue that programmers implement BDI agents according to the needs of their applications, and hence that missing some of the recommended capabilities is acceptable and more practical. This argument is true to some extent, especially for a developer who carries out the hard work of implementing all the functionalities of a BDI architecture from scratch. However, the situation is different when an alternative programming approach is employed, whereby (more of) the hard work and low-level functionalities are supported by agent programming platforms/languages [9, 57, 58, 59, 63]. In this latter case, the role of developers shifts from programming low-level functionalities to identifying declarative knowledge (preconditions of plans, for instance). Besides, it is recommended, as is the case for widely used programming languages, that agent programming platforms/languages provide support for many more features than will be used in any particular application [57, 58, 59]. Consequently, it is clearly advantageous for the architecture of a BDI agent to cover as many flexibility requirements as possible.
Presentation of the case study
The case study involves the simulation of a domestic robot (the agent) that its owner, a person with special needs, can command through a specific board/interface (see Fig. 3). Through the interface, the owner can issue three kinds of commands. The first command, “Go to bank” (goal g1), aims to withdraw some money for the owner (to allow the owner to pay the rent in cash to the landlord). The second command, “Go to supermarket” (goal g2), aims to do some shopping for the owner (the shopping is paid for by credit card). The third command, “Come back home for a task of great importance” (goal g3), aims, for instance, to give the owner his or her medicines.
The interface of the application used for the experimentation.
The board of commands allows the owner to enter a set of commands or tasks to be executed by the agent. It also allows a command to be entered while other commands are being executed.
The set of requested commands is viewed by the agent as a set of “external” goals.12 The commands’ board automatically appends (without the owner’s intervention) to the list of requested commands the following particular command: “Come back home” (goal g4), which tells the agent to come back home, not for a task of great importance but simply to bring the money or the purchases back to its owner.
In this case study, the agent’s environment consists of only two possible destinations: the “bank” and the “supermarket”. Each of these destinations is linked to the owner’s “home” (located at coordinate (0, 0)) by a unique principal way (see Fig. 3). To reach a specific destination, the agent should follow the associated principal way. If the agent is not positioned on that way, it should first join it before following it towards the destination. To achieve the two come-back commands (goals g3 and g4), the agent should follow, towards the home, the last principal way that it took or intended to take. The four goals g1, g2, g3 and g4 are assumed to be in conflict, i.e., they cannot be realized in parallel.
Besides, we assume that each goal g
The action “Join _way” has no precondition and allows the agent to join, from its current position, the principal way leading to the destination. The action “Follow _way” requires the agent to be positioned on the principal way in question before the action begins; it moves the agent along that way towards the destination. The action “Finish_plan” is executed to conclude the realization of the goal associated with the plan; it is essentially used to update some of the agent’s beliefs.
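The three actions can be sketched as operations on a small agent state. This is only an illustration, assuming that a plan runs the join, follow and finish steps in that order and that positions can be abstracted to labels rather than grid squares; the `AgentState` fields, the function names and the way labels are hypothetical and are not taken from the Java implementation of the case study.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class AgentState:
    on_way: Optional[str] = None   # principal way the agent currently stands on, if any
    position: str = "elsewhere"
    achieved: Set[str] = field(default_factory=set)

def join_way(state: AgentState, way: str) -> None:
    """No precondition: the agent reaches the principal way from wherever it is."""
    state.on_way = way

def follow_way(state: AgentState, way: str, destination: str) -> None:
    """Precondition: the agent must already be positioned on the way."""
    if state.on_way != way:
        raise RuntimeError("precondition failed: agent is not on the principal way")
    state.position = destination

def finish_plan(state: AgentState, goal: str) -> None:
    """Concludes the goal by updating the agent's beliefs (here, a set of achieved goals)."""
    state.achieved.add(goal)

def run_plan(state: AgentState, goal: str, way: str, destination: str) -> None:
    if state.on_way != way:
        join_way(state, way)
    follow_way(state, way, destination)
    finish_plan(state, goal)

state = AgentState()
run_plan(state, "g2", "supermarket_way", "supermarket")
```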
The utility U
Utilities and costs associated with goals
It is worth noting that the cost of a plan p
Costs were normalized (i.e., mapped onto the [0, 1] scale) by dividing them by 152. In the case study, we assume that the agent is equipped with a battery that provides it with energy, and that 152 is the number of squares the agent can travel before running out of energy. In other words, a fully charged battery allows the agent to travel 152 squares (issues concerning the management of the agent’s battery and the maintenance of its energy at an acceptable level are not treated in this case study).
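As a short worked illustration of this normalization, the sketch below divides a cost expressed in squares by the battery range of 152 squares. The function name and the range check are assumptions added for illustration; the sample values are arbitrary and are not taken from the tables of the case study.

```python
BATTERY_RANGE_SQUARES = 152  # squares travelled on a full battery, as stated in the case study

def normalize_cost(squares: int) -> float:
    """Map a cost in travelled squares onto the [0, 1] scale."""
    if not 0 <= squares <= BATTERY_RANGE_SQUARES:
        raise ValueError("cost must lie between 0 and the battery range")
    return squares / BATTERY_RANGE_SQUARES

quarter = normalize_cost(38)   # 38 / 152 = 0.25
half = normalize_cost(76)      # 76 / 152 = 0.5
full = normalize_cost(152)     # 152 / 152 = 1.0
```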
The net utility (importance) of a goal g
A last point about goals g
The interval [t
In the case study, the temporal unit (t.u) used is the second. Besides, for a given goal g
As time is important, the application is equipped with a particular module that plays the role of a clock. This clock continuously provides the time to all the application’s modules. Finally, the case study was implemented in Java (Eclipse IDE), under Windows XP Service Pack 2, on an Acer laptop (9-inch screen, Intel Atom 1.6 GHz, 500 MB of RAM and an 8 GB hard disk). The agent’s modules, as well as all the other modules of the application (the application’s clock, the agent’s environment, etc.), were implemented as threads to allow parallel execution. The agent’s data structures (plans, goals, etc.) were implemented as objects.
Goal’s temporal parameters
t.u: temporal unit.
The objective of the experimentation is to show how the agent’s well-informedness (i.e., the richness of the set of relevant attributes upon which the agent’s reasoning is based) improves the exactness of the agent’s behavior and hence its flexibility. Indeed, a key point about behavioral flexibility is that it is influenced by the exactness and suitability of the agent’s reaction to changes [100]. An agent that reacts incorrectly, irrationally or unsuitably to changes should not be considered flexible. In this experimentation, we focus on the usefulness of two attributes, the estimated global cost for realizing goals and the goal’s urgency, and show their contribution to the improvement of behavioral flexibility. It is evident that considering urgency (and therefore deadlines) in the reasoning allows the agent to reduce the situations where the realization of tasks exceeds their deadlines. The experimentation does not address this evident utility of the urgency attribute; it addresses another use, namely its exploitation by the deliberation module to estimate the global cost of realizing goals associated with temporal attributes. In the context of the presented case study, we experiment with three variants of the agent’s internal structure, all based on the Saadi et al. architecture [80]. The three variants were obtained by slightly varying the deliberation algorithm and the relevant attributes used by the Saadi et al. architecture.14
In the 1
In the 2
In the 3
In the case study, as the agent (i.e., the domestic robot) adopts and receives goals from an external source (i.e., robot’s owner), the three previous variants extend the Saadi et al. architecture with consideration of “external” goals. In fact, “internal goals” in the Saadi et al. architecture are only generated to satisfy agent’s needs (i.e., motives): they are not generated to serve external goals. In this experimentation, in order to extend the “goal generation” task with consideration of external goals, we have chosen the “internalization” technique which simply consists in adding and “copying” the received external goal into the set of agent’s goals (for more details about the internalization of external goals, see [18]: Section 4.1.4, page 379).
The principal situation of interest in this experimentation is the following: while our domestic robot is realizing the goal g2 (Go to supermarket), it receives from its owner another goal that is more important but not urgent (i.e., the urgency event has not been triggered). How should the robot deal with this situation? Should it temporarily interrupt the execution of g2 in favor of the received goal, which is more important but not urgent? Or should it handle the newly requested goal after the termination of g2? What is the rational decision and behavior in this situation? The rationality criterion used for the experimentation is the optimization of the total cost of realizing the agent’s goals. The decision to “interrupt” or “not interrupt” g2 should optimize this total cost, which is quantified in terms of the number of travelled squares.
In the following section, we give the scenarios that we have experimented and the results obtained in the case of each of the above three variants.
Experimented scenarios and obtained results
In line with the principal situation of interest (see the end of the previous section), we experimented with four scenarios: A, B, C and D.
In “scenario A”, the command corresponding to g2 (Go to supermarket) is transmitted by the owner 30 t.u after the start of the application. The command corresponding to g1 (Go to bank), more important than g2, is transmitted once the agent has progressed 6 squares along the path leading to the supermarket. “Scenario B” is the same as scenario A, except that the command corresponding to g1 is transmitted at the moment when only 6 squares remain for the agent to reach the supermarket. In “scenario C”, the command corresponding to g2 (Go to supermarket) is transmitted by the owner 30 t.u after the start of the application. The command corresponding to g3 (Come back home for a task of great importance), more important than g2, is transmitted at the moment when 7 squares remain for the agent to reach the supermarket. “Scenario D” is the same as scenario C, except that the command corresponding to g3 is transmitted at the moment when the agent has covered only 7 squares of the path leading to the supermarket. For this scenario, we assume that the attribute t
In all of the four previously cited scenarios, the most important goal (g1 for scenarios A and B and g3 for scenarios C and D) arrives with no urgency.
The scenarios above were chosen by varying two criteria. The first criterion concerns the moment at which the command corresponding to the most important goal is transmitted (at the beginning or at the end of the realization of the goal in execution). For scenarios A and D, the most important goal is transmitted at the beginning of the realization of g2. For scenarios B and C, it is transmitted at the end of the realization of g2. The second criterion is whether or not the urgency of the most important goal is triggered during the execution of the currently active goal. For scenarios A and C, the urgency is triggered during the realization of g2. For scenarios B and D, the urgency is not triggered during the realization of g2.
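To see why a later urgency trigger can change the cheaper option, consider the following simplified sketch for a g3-like goal (come back home). It is not the estimation procedure used by the agent variants, whose formulas are not reproduced here. It assumes that the agent walks a straight way of `way_length` squares from home, that an interrupted g2 is restarted from home afterwards, and that a triggered urgency forces an immediate return. The function names and the way length of 20 squares are hypothetical and do not come from the experiment.

```python
from typing import Optional

def cost_if_interrupt_now(done: int, way_length: int) -> int:
    """Walk back home (done squares), then redo g2 and come back (2 * way_length)."""
    return done + 2 * way_length

def cost_if_continue(done: int, way_length: int, urgency_after: Optional[int]) -> int:
    """Total squares if g2 is not interrupted now.

    urgency_after is the number of further squares after which the urgency of the
    new goal would be triggered, or None if it is never triggered during g2.
    """
    remaining = way_length - done
    if urgency_after is not None and urgency_after < remaining:
        # The agent is forced to interrupt later, from a point farther from home.
        return urgency_after + cost_if_interrupt_now(done + urgency_after, way_length)
    # g2 completes, then a single trip home serves the new goal.
    return remaining + way_length

def should_interrupt(done: int, way_length: int, urgency_after: Optional[int]) -> bool:
    return cost_if_interrupt_now(done, way_length) < cost_if_continue(
        done, way_length, urgency_after)

urgent_later = should_interrupt(done=7, way_length=20, urgency_after=5)
never_urgent = should_interrupt(done=7, way_length=20, urgency_after=None)
```

Under these assumptions, interrupting immediately is cheaper whenever the urgency would be triggered before g2 completes, and continuing is cheaper otherwise. An agent that reasons only on importance has no access to this distinction, which is the kind of information the urgency attribute adds to the cost estimate.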
According to the deliberation principle used by each agent’s variant (see Section 4.2 for more details) and facing the previous scenarios, the agent will behave as follows.
In the case of the 1
Table 4 gives the results obtained from the experimentation. We notice three points. First, the agent using the 1
Results of the experimentation
According to Table 4, the appropriate behavior (i.e., optimizing the number of travelled squares), in case of scenario A and C, would be to interrupt the active goal g2 by the non-urgent and most important goal. This explains why the 1
Estimations realized by the agent’s 3
This experiment illustrates an example that shows the contribution of the agent’s well-informedness, i.e., the richness of the set of relevant attributes, to the suitability and the rationality of the agent’s behaviour, so to its flexibility.
Here, we summarize the comparison of our orientation for the design of BDI agents with some representative BDI architectures, according to some flexibility requirements (see Table 6). The flexibility requirements include those features which were specified in the BDI model (i.e., the main flexibility requirements, see Section 2 for more details) plus some enriching and additional requirements related to enhancing performance, enhancing well-informedness and some refinements of the trade-off between reactive and deliberative behaviors (for more details about the enriching of the main flexibility requirements see Section 3.2).
We begin this comparison with a brief overview of how the discussed representative BDI architectures conform to the flexibility requirements. Beginning with the main flexibility requirements, only the ICAGENT architecture [53], among the representative BDI architectures, includes all of them. Saadi et al.’s architecture [80] also includes all of the main requirements except for the goal generation task, whose functioning was not detailed in [80] (see Table 6). Concerning the additional refinements of the trade-off between the reactive and deliberative behaviors (see Table 6), only ICAGENT includes them. ICAGENT and Koch and Dignum’s architecture [50] are the best among the discussed BDI architectures with respect to coverage of the performance-related requirements. In terms of well-informedness, Saadi et al.’s architecture and Vikhorev et al.’s architecture [94] are of interest, as they include additional relevant attributes such as the goal’s urgency. Finally, the natural inclusion of motives in BDI architectures was not considered by any of the above architectures.
Comparison of the proposed approach with some representative BDI architectures. A cross indicates satisfaction of the specific criterion.
The discussed BDI architectures offer a variable and partial coverage of the flexibility requirements; none of them covers these requirements completely. Our suggested orientation for the design of BDI architectures therefore includes all the above flexibility requirements. Although we have only sketched an abstract BDI architecture, we have given some details about the various modules (the techniques used for their realization) and their interaction. For instance, the performance-oriented requirements were incorporated in our approach by adopting the “parallel execution” principle and by including the “Perceptions filter” and “Environment monitoring” modules among the interfacing modules (the “sense” phase of the BDI reasoning), in order to focus the sensing on the relevant information.
Concerning the additional refinements of the trade-off between the reactive and deliberative behaviors, they were included in our approach via the “Meta-level Decision” module. For the natural inclusion of motives inside BDI agents, we drew inspiration from previous work [81], by including motives in the agent’s goal set via maintenance goals. Maintenance goals are connected to the “Goal generation” module in order to command the generation of the goals necessary, for instance, to reestablish the maintenance condition (for more details see Section 3.4).
The BDI model aims to design flexible agents. The key ingredient of the BDI model that contributed to concretizing this flexibility is the inclusion of the practical reasoning capability. However, BDI practical reasoning has not received sufficient attention from the agent community: for performance and simplification reasons, some of the concepts included in the BDI model are not explicitly included in BDI architectures. Neglecting these concepts means that the ability of the agent to reason will be limited, hence leading to a more or less flexible reasoning, depending on the concepts explicitly included. In this paper, we critically reviewed some representative BDI architectures regarding the flexibility inherent in practical reasoning, in the context of single agents not situated in uncertain environments.
To facilitate the study, we enriched and refined some existing flexibility requirements to provide a framework for reviewing the architectures. The proposed framework led us to identify flexibility requirements that were neglected by the reviewed architectures. Building on the results of this review, and on the advantage for a BDI architecture of covering as many flexibility requirements as possible, we sketched a new orientation for the design of BDI agents that goes beyond the current state of the art by including the neglected flexibility requirements. More precisely, this new orientation consists in: (1) improving the agent’s well-informedness in the current BDI architectures, (2) exploiting the similarity between the two concepts of “motive” and “maintain goal” to naturally include motives in BDI agents, (3) incorporating goal generation, (4) considering the balance between the “lower-level” reactive and deliberative behaviors and the balance between “plan-portion” reactivity and “plan-portion” deliberation, (5) including the filtering of perceptions, (6) including a structure that materializes the relevant changes monitor, and (7) using a parallel architecture.
At the end of this paper, an experimentation was conducted as an example to evaluate one suggested improvement for the design of BDI agent, i.e., the agent’s well-informedness. More precisely, in the context of this experimentation, we have shown that adding the goal’s urgency and the estimated cost for realizing goals to the set of attributes relevant to the agent’s reasoning (more essentially, goal’s importance) contributes to the improvement of the behavioral flexibility.
As future work, it would be interesting to experiment with and evaluate the other suggested improvements for the design of BDI agents in real or simulated scenarios. Finally, it is worth extending and examining the notion of flexibility inherent in practical reasoning in the context of BDI agents living in multi-agent and uncertain environments. Because Holonic Multi-Agent Systems (HMAS) [29, 32, 34, 35, 46, 52, 78] provide a high degree of socially oriented flexibility (self-organization), considering them when treating flexibility in the context of multi-agent systems is well warranted.
Footnotes
A BDI architecture is a given concretization of the BDI model.
The feasibility checking included in the deliberation of the classical scheme and concretization of Practical Reasoning is reduced to conflict checking (between the new goal and the current focus of the agent). This scheme does not conform to the feasibility property of the BDI model. In fact, in this scheme the search for applicable plans (applicability checking) to achieve the new goal takes place after the deliberation. This classical scheme was criticized and improved in [3, ]. The feasibility checking included in ICAGENT’s deliberation conforms to this improved scheme. In this new scheme, the feasibility checking includes, in addition to conflict checking, the applicability checking (see Section 3.4 for more details).
Plan executed in a deliberative manner means it is selected for execution after deliberation. On the other hand, plan executed in a reactive manner means it is selected directly for execution without deliberation.
Pro-activeness behavior means acting with planning and goal orientation.
Conflicting, i.e., the plan cannot be executed in parallel (interleaved) with the other active plans. A plan that does not conflict with plans in execution is qualified, in the literature, as “free-conflict” plan (see [80]) or equivalently as “executable” or “non-conflicting” plan (see [ ]).
Although the work of Zhang and Huang was the first to explicitly mention the advantage and impact of parallel execution on the reactivity and performance of BDI agents, their proposed architecture is not the first parallel one for BDI agents. In fact, some prior BDI architectures such as Jadex [ ] included this parallel execution principle.
The similitude between the two concepts of “motive” and “maintain goal” was indirectly indicated by Kowalski [55], where he suggested that maintain goals appear in the biological mechanism of homeostasis, which plants and animals use to sustain a stable relation with their environment (for more details, see [ ]).
An achievement goal represents a goal in the classical sense by specifying the world state an agent wants to bring about. On the other hand, a perform goal does not require the specification of a particular state of the world to be achieved, as is the case for achievement goals. It specifies some tasks to be accomplished (for more details see for instance [ ]).
A plan’s precondition is a condition that should be satisfied before executing the plan. In addition, a plan’s in-condition is a condition that should remain satisfied during the execution of the plan.
In the experimentation, we used the Saadi et al. architecture as a base template for the implementation of the three variants because its deliberation step manipulates the urgency attribute (which is a relevant characteristic for our experimentation).
The cost of interrupting g’ means in this context the global cost (for realizing goals) in the case of interrupting g’.
Appendix A. Proofs concerning Net utilities of the case study
Assume that Net_utility
Author’s Bios
