Abstract
Multi-agent systems is an evolving discipline that encompasses many different branches of research. The long-standing Agents at Aberdeen (
Introduction
Agents research at Aberdeen has a long history, with work ranging from fundamental theoretical research on topics such as trust [2,71], norms [51,114] and argumentation [109], to applications of such technologies [37,125,146]. The Agents at Aberdeen group
In this paper, we focus on the work of the current members of the group rather than taking a historical perspective.
Resilience is the ability of a system to detect and recover from problems that may arise during execution (failures, unforeseen circumstances, etc.). Autonomous systems require resilience due to constant changes in the environment. There are many techniques for increasing resilience, such as runtime monitoring to detect failures [75,76], the use of norms and rules to design compliant systems [3,126], and automated planning to attempt to recover from failures [41,112]. Due to the increasing ubiquity of autonomous systems, there has been significant interest in making such systems more resilient. For example, the Trustworthy Autonomous Systems Hub [121] is an initiative funded by UKRI which includes an element focused on resilience.
A reliable system is a system that performs as expected. Reliability is critical when decisions made autonomously by a system are mission or safety critical, or may affect humans. A recent Dagstuhl seminar [66] brought together researchers working on software engineering, verification, ethics, and machine learning to discuss reliability in MAS, and highlighted the topic’s importance. One of the outcomes of the seminar was a special issue [67] that posed two challenges: defining what is a good decision, and how to check that the system will choose to make good decisions. There are many ways of improving the reliability of a system and of tackling these challenges. For autonomous systems, the techniques often include verification and validation, ranging from formal verification to simulation-based and physical-based testing [78]. Reliability is somewhat similar to resilience since both deal with increasing the trustworthiness of a system, and the two concepts are therefore closely interconnected.
Coordination in a MAS concerns the coordination of individual and joint goals/actions of multiple agents. Although the concept is simple, coordination can be achieved using a multitude of different techniques from MAS. For example: agents can engage in argumentation to deliberate about how and when to coordinate and how to avoid conflicts arising from such coordination [161]; a trust system can be used to determine potential agents for coordination [1]; norms can be specified to guide agent coordination in order to avoid conflicts or improve efficiency [51]; multi-agent planning can solve scheduling problems in coordination as well as generate decentralised plans and/or centralised joint plans [41]; and goal and plan recognition can be used to reduce the need for explicit communication during coordination [131].
In this paper, we describe recent work done by

Word cloud containing the most common keywords in recent publications by the members of
This paper is organised as follows. In Section 2 we list our combined efforts towards resilience and reliability in autonomous systems. Section 3 briefly contextualises coordination in MAS and describes the research that we have done in various topics in this area. In Section 4 we discuss open research challenges in MAS including, but not limited to, resilience, reliability, and coordination. Finally, Section 5 summarises the contributions of the group so far and highlights future work.
We use the word resilience to express the ability of an autonomous agent to withstand several types of failure while remaining functional (to some degree) and ensuring a minimum level of performance. The notion of reliability is more general than resilience as the agent must have consistently good performance in all situations. Reliability is also a prerequisite for a system to be trusted by other agents/users. While trust has many definitions [135] and is influenced by factors such as reputation, risk, experience, and transparency, among others, the members of the
Formal verification of autonomous decision-making
One way of providing stakeholders with assurance in the reliability of a system is verification. Formal verification uses formal methods to prove specific properties (often related to requirements) of system behaviour. Examples of formal methods include model checking [49], where the desired property is specified in an appropriate logic and a model checking algorithm is used to determine whether the property holds for a model of the system; and runtime verification (RV) [31], which uses monitors to collect runtime events and then checks formal properties against these events. Many autonomous systems and robotic applications, particularly in safety-critical domains [104], require some form of formal verification to assure stakeholders that unsafe behaviours cannot occur (or are at least very unlikely to occur), and/or that system objectives will be achieved. This is evidenced by, for example, the overview of verification challenges for inspection robots reported in [78], which also describes the common issues encountered in verifying remote inspection tasks during the authors’ experience in three research hubs within the UK’s “Robots for a Safer World” programme and which included involvement by members of
Another challenge in verification of agents is modelling the environment due to its dynamic nature, which often results in an incomplete abstraction of the real environment. Therefore, a static verification of the system using this abstraction of the environment does not provide sufficient assurances about the real reliability of the system. In [77], Ferrando, Dennis, and Cardoso et al. introduce a domain specific language to model the environment in such a way that a model checker can statically verify the environment. This language also allows the compilation of runtime monitors that can verify the same constraints at runtime, effectively checking if the environment abstraction holds during execution.
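The runtime half of this idea can be illustrated with a minimal sketch of a monitor that checks an environment assumption against a stream of events. This is not the domain specific language of [77]: the `EnvironmentMonitor` class, the event format and the radiation bound are hypothetical and serve only to show how an assumption made during static verification could be re-checked during execution.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical event format: a mapping from sensor names to readings.
Event = Dict[str, float]


@dataclass
class EnvironmentMonitor:
    """Checks a safety-style environment assumption on every observed event."""
    name: str
    assumption: Callable[[Event], bool]
    violations: List[int] = field(default_factory=list)
    seen: int = 0

    def observe(self, event: Event) -> bool:
        """Return True if the assumption holds for this event; record its index otherwise."""
        holds = self.assumption(event)
        if not holds:
            self.violations.append(self.seen)
        self.seen += 1
        return holds


# Illustrative assumption: the environment model assumed radiation never exceeds 5.0.
monitor = EnvironmentMonitor("radiation_bounded", lambda e: e["radiation"] <= 5.0)
trace = [{"radiation": 1.0}, {"radiation": 7.0}, {"radiation": 3.0}]
results = [monitor.observe(event) for event in trace]
print(results, monitor.violations)
```

A recorded violation indicates that the abstraction used for static verification no longer matches the real environment, so guarantees obtained from the model checker should not be relied upon from that point onwards.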
Members of the
Formal verification is a post-design (or in the case of runtime verification post-development) approach, in that it checks properties of an existing agent design, program or system. If the desired property does not hold, the system design or implementation must be revised by a developer and re-verified in an iterative process. An alternative approach is to derive a (correct) behaviour, plan, program or policy from the desired property together with a model of the environment and the information available to an agent. Such approaches differ from learning-based model-free techniques in three key ways [81]. First, they require no data in order to decide valid actions in any given environment. Second, they provide formal guarantees about the resulting policies that learning-based approaches cannot. Finally, while no data is required, the (offline or online) computational cost of deriving a plan or program may be high. Similarly, planning differs from preprogrammed policies (e.g., using agent programming languages) in that it relieves the burden on the designer in accounting for every possible contingency imposed by the environment.
Work in the
In terms of AI behaviour composition,
Members of the
While most research in the past 20 years on automated planning has focused on algorithms for state-space search using formalisms for domain-independent planning, recent efforts [90] have revived interest in HTN planning, leading to new algorithms and heuristics. By contrast, members of our group have leveraged techniques from BDI agent reasoning cycles to substantially optimise HTN planning, leading to the development of HyperTensioN [106]. This planner won the 2020 International Planning Competition for Hierarchical Planning against contenders from multiple research groups. Meneguzzi and others have also made substantial contributions to planning using reinforcement learning and imitation learning, achieving a high degree of sample efficiency [79].
Finally, we have consistently made contributions to applications of planning formalisms and algorithms. Specifically, Meneguzzi and others have applied planning techniques to a number of domains, both within computer science, and in transdisciplinary research. First, Meneguzzi et al. automatically generate and validate agent commitment protocols [115,116,145] using an HTN planning formalism. Second, Meneguzzi et al. carried out a body of research on applying a combination of automated planning and machine learning as a tool for neuroscientific research [70,128].
Intention progression
Autonomous agents often have to achieve multiple goals in parallel. Even if the behaviour or plan to achieve each goal is (provably) correct, e.g., has been verified or derived from a specification of the goal or task, the problem of acting (plan execution) remains [83]. The Intention Progression Problem (IPP) is the problem of what a Belief-Desire-Intention (BDI) agent should do next [101], that is, what means (i.e., plan) to use to achieve a given (sub)goal, and which of the currently adopted plans (i.e., intentions) to progress at the current moment. An important capability of an intelligent agent is the ability to progress multiple intentions in parallel, by interleaving the steps in each intention to provide the best outcome for the agent. This problem is both central to agent reasoning and complex in its nature. For example, ‘best outcome’ may have different definitions depending on the application, while goals and plans may conflict given the resources available. A key challenge is the interleaving of steps in plans in different intentions to avoid conflicts, i.e., when the execution of a step in one plan makes the execution of a step in another concurrently executing plan impossible.
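The conflict described above can be made concrete with a small sketch. It assumes a deliberately simplified model in which each plan step has a precondition, an add list and a delete list over a set of facts, and it searches exhaustively for an interleaving in which every step is executable. The `Step` type, the `interleave` function and the example facts are hypothetical, and the brute-force search is a stand-in for illustration rather than any scheduler from the literature.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Tuple


class Step(NamedTuple):
    name: str
    pre: FrozenSet[str]      # facts that must hold before the step
    add: FrozenSet[str]      # facts made true by the step
    delete: FrozenSet[str]   # facts made false by the step


Plan = Tuple[Step, ...]


def interleave(state: FrozenSet[str], intentions: Tuple[Plan, ...]) -> Optional[List[str]]:
    """Depth-first search for an order in which every step's precondition holds."""
    if all(len(plan) == 0 for plan in intentions):
        return []
    for i, plan in enumerate(intentions):
        if not plan or not plan[0].pre <= state:
            continue  # intention finished, or its next step is not executable now
        step = plan[0]
        next_state = (state - step.delete) | step.add
        rest = intentions[:i] + (plan[1:],) + intentions[i + 1:]
        suffix = interleave(next_state, rest)
        if suffix is not None:
            return [step.name] + suffix
    return None  # every choice leads to a conflict


def fs(*facts: str) -> FrozenSet[str]:
    return frozenset(facts)


# Moving to the lab deletes "at_base", which the charging step of the other intention needs.
sample = (
    Step("move_to_lab", fs("at_base"), fs("at_lab"), fs("at_base")),
    Step("collect_sample", fs("at_lab"), fs("has_sample"), fs()),
)
charge = (Step("charge", fs("at_base"), fs("charged"), fs()),)

schedule = interleave(fs("at_base"), (sample, charge))
print(schedule)
```

Progressing the first intention greedily makes the charging step impossible, so the search backtracks and schedules `charge` first. The exhaustive search is exponential in the number of steps, which is one reason practical approaches rely on heuristics or sampling.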
Members of the
As agents are becoming more complex and ubiquitous in our everyday lives, it has become crucial to be able to explain their behaviour and allow them to have profound interactions with humans. While there are myriads of explanation techniques using graphs, statistics, and natural language descriptions, argumentation has a unique advantage in transparently explaining the procedure and the results of reasoning through an “arguing process”. The usual process of arguing can be viewed as (1) identifying and modelling the information at hand and (2) generating an explanation for the topic, usually through some fictitious proponent and opponent debate game. Of course, the structure of the arguments and the protocol of the debate game will vary depending on the desired expressiveness of the explanations and the nature of the topic.
For instance, Yun, Vesic and Oren [168] proposed a dialectical framework based on argumentation to compute and explain pure strategy Nash equilibria. Roughly speaking, this framework extends the framework of [118] (allowing attacks on attacks) and models a two-agent dialogue, where the proponent’s goal is to show that an argument is a Nash equilibrium while the opponent seeks to demonstrate that the proponent’s argument is not a Nash equilibrium by proposing alternatives. Another example is the work of Dennis and Oren [65], where they proposed a dialogue-based approach for explaining the behaviour of a BDI agent. This dialogue considers two participants, who may have different views regarding the beliefs, plans and external events, and make utterances which incrementally reveal their traces to each other, allowing them to identify divergences or to conclude that their traces agree. Moreover, as part of the EPSRC funded “Scrutable Autonomous Systems” project, the
The members of the
Computational trust and normative reasoning
In an open MAS, agents must identify others to coordinate with, either to undertake joint actions or to delegate tasks to. A core problem here is that some agents within the system may be malicious, or simply incompetent. Such agents should ideally be avoided, or measures should be put in place to mitigate their effect, in order to improve the resilience and reliability of the system.
A trust and reputation system allows an agent to combine their direct experiences of others (trust) and obtain information from third parties about others (reputation), so as to compute a reliability ordering about these other agents. This reliability ordering can then be used to decide who to interact with [1,37], or to put other measures into place so as to minimise the harm these other agents can cause. For example, we combined a trust rating with principal agent theory to determine how much utility transfer should take place between a delegator and a delegatee, and how much monitoring the delegator should undertake of the delegatee’s behaviour [37,48].
The Eigentrust framework [93] is a simple and popular approach to computing trust among agents, originally used in the context of peer-to-peer file sharing. Eigentrust assumes that trust is transitive (if A trusts B and B trusts C then A should trust C), and that an agent’s ability to provide reputational information is correlated to its ability to perform a task. It then represents the (direct) trust relationship between agents in a matrix. Multiplying a vector by this matrix yields a new vector representing direct trust, and – under the assumptions described above – further multiplications of the resultant vector by the matrix yield a new vector describing first hand, second hand, and so on reputational information. However, Eigentrust does not discriminate between lack of information about an agent, and lack of trust in the agent. We proposed the MaxTrust algorithm [2] to take this difference into account, and it has been shown to obtain more accurate trust ratings in some situations.
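The repeated matrix-vector multiplication described above can be sketched as a power iteration. The code below is a minimal illustration in plain Python, assuming row-normalised non-negative local trust scores and a uniform starting vector; the uniform fallback for agents with no positive experiences is an assumption of this sketch, and the function names and example scores are hypothetical. It does not implement MaxTrust [2].

```python
from typing import List


def normalise_rows(local: List[List[float]]) -> List[List[float]]:
    """Turn raw local trust scores into rows that each sum to one."""
    n = len(local)
    rows = []
    for row in local:
        clipped = [max(value, 0.0) for value in row]
        total = sum(clipped)
        # Assumption: an agent with no positive experience spreads trust uniformly.
        # Note that "no information" and "no trust" become indistinguishable here.
        rows.append([v / total for v in clipped] if total > 0 else [1.0 / n] * n)
    return rows


def eigentrust(local: List[List[float]], iterations: int = 50) -> List[float]:
    """Power iteration: repeatedly propagate trust through the normalised matrix."""
    c = normalise_rows(local)
    n = len(c)
    trust = [1.0 / n] * n
    for _ in range(iterations):
        # New trust in agent j is the trust-weighted opinion of every agent i about j.
        trust = [sum(c[i][j] * trust[i] for i in range(n)) for j in range(n)]
    return trust


# local[i][j]: how satisfied agent i has been with agent j (illustrative numbers).
local_scores = [
    [0.0, 4.0, 1.0],
    [3.0, 0.0, 1.0],
    [5.0, 5.0, 0.0],
]
global_trust = eigentrust(local_scores)
print([round(t, 3) for t in global_trust])
```

In this example agent 2 receives the lowest global trust because both other agents report mostly poor experiences with it. The comment in `normalise_rows` marks the point at which a missing opinion and a negative opinion collapse into the same value.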
Another element of work on trust taking place in the
While trust allows for the filtering of interaction partners, another aspect of resilience and robustness involves putting appropriate sanctions into place to punish misbehaving agents, and to compensate those who suffered. The
A key problem in normative MAS is detecting norm violations. For large-scale MAS, monitoring for violations may involve significant computational costs. Members of the
Norms may not always be explicit, and an agent entering a new system may need to learn the system’s norms. We have described a plan recognition based approach to doing so [123], which effectively filters out norms inconsistent with executed plans. A refinement to this work [51] used a Bayesian approach to identify the most likely norms.
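The difference between filtering and a Bayesian treatment can be illustrated with a toy sketch. It assumes that every candidate norm is a prohibition on a single action and that agents violate a norm with a small fixed probability; the `update` function, the `violation_rate` parameter and the example plans are hypothetical, and the likelihood model is far simpler than the one used in [51].

```python
from typing import Dict, Iterable, List


def update(prior: Dict[str, float], plan: Iterable[str],
           violation_rate: float = 0.1) -> Dict[str, float]:
    """One Bayesian update over hypotheses of the form 'action X is prohibited'."""
    executed = set(plan)
    unnormalised = {}
    for forbidden, probability in prior.items():
        # A plan containing the forbidden action is only explained by a violation.
        likelihood = violation_rate if forbidden in executed else 1.0 - violation_rate
        unnormalised[forbidden] = probability * likelihood
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("all hypotheses ruled out; use a violation_rate above zero")
    return {h: value / total for h, value in unnormalised.items()}


candidates = ["speed", "park", "litter"]
posterior = {action: 1.0 / len(candidates) for action in candidates}

observed_plans: List[List[str]] = [
    ["drive", "park"],
    ["drive", "speed", "park"],
    ["drive", "park"],
]
for observed in observed_plans:
    posterior = update(posterior, observed)

most_likely = max(posterior, key=posterior.get)
print(most_likely, {h: round(p, 3) for h, p in posterior.items()})
```

With a `violation_rate` close to zero the update behaves like the filtering approach, since any hypothesis contradicted by an executed plan loses almost all of its probability. A larger rate lets the learner tolerate occasional violations instead of discarding a norm after a single counterexample.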
Coordination
There is a large body of research dealing with coordination among agents including work in game theory [141] and multi-agent planning [35,147]. At
Making decisions in a multi-agent setting
The problem of making a decision when multiple agents are involved is very complex, especially when many parameters must be taken into account for the decision. There can be one or several global cost functions that need to be optimised, and agents can have conflicting viewpoints about the situation at hand as well as different objectives. Within
It should also be noted that within a MAS, concepts such as trust, goals, obligations and permissions, as well as the provenance of information also affect decision-making. Research within
While there is increasing evidence that argumentation-based approaches are able to model human reasoning to a certain extent [50,136], the generation of arguments [167] and the expensive computation of some argumentation extensions [69], such as those under the preferred semantics, are often limitations that prevent their wide adoption in settings where fast decision-making is required. To accommodate such settings, Yun et al. have also considered rule-based approaches that can be applied directly to knowledge bases in the Ontology-Based Data Access (OBDA) setting [94,166]. In this setting, a set of well-designed rules (an ontology) is used to access and combine the knowledge of each agent, allowing heterogeneous agents with different vocabularies (e.g., agents working in different application domains) to communicate seamlessly. For instance, in [94], a multi-criteria decision-making system is proposed to model each agent's evaluation of the alternatives against the criteria using expressive languages. This avenue of research was further investigated by Yun et al. [166], who combined inconsistency measures with Shapley techniques to assist decision-making. The approach was successfully applied to the aforementioned packaging use-case, allowing the data from more than 20 professionals working in the food industry to be aggregated and used in the decision-making process.
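To give a sense of why computing preferred extensions is costly, the following sketch enumerates them by brute force for an abstract argumentation framework, i.e., a set of arguments and a binary attack relation. A set is admissible if it is conflict-free and defends each of its members, and a preferred extension is a maximal admissible set. The function name and the three-argument example are illustrative; practical solvers use far more refined algorithms.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple


def preferred_extensions(arguments: Iterable[str],
                         attacks: Set[Tuple[str, str]]) -> List[FrozenSet[str]]:
    """Enumerate all subsets, keep the admissible ones, then keep the maximal ones."""
    args = sorted(arguments)

    def conflict_free(subset: Tuple[str, ...]) -> bool:
        return not any((a, b) in attacks for a in subset for b in subset)

    def defends(subset: Tuple[str, ...], argument: str) -> bool:
        # Every attacker of `argument` must itself be attacked by a member of `subset`.
        attackers = [source for (source, target) in attacks if target == argument]
        return all(any((d, b) in attacks for d in subset) for b in attackers)

    admissible = [
        frozenset(subset)
        for size in range(len(args) + 1)
        for subset in combinations(args, size)   # 2^n candidate sets in total
        if conflict_free(subset) and all(defends(subset, a) for a in subset)
    ]
    return [s for s in admissible if not any(s < other for other in admissible)]


# a and b attack each other, and b attacks c.
framework_attacks = {("a", "b"), ("b", "a"), ("b", "c")}
extensions = preferred_extensions({"a", "b", "c"}, framework_attacks)
print(sorted(sorted(ext) for ext in extensions))
```

The enumeration visits every subset of the arguments, so its cost doubles with each additional argument. This is the kind of behaviour that makes such semantics hard to use when decisions are needed quickly.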
Multi-agent planning
Multi-Agent Planning (MAP) has had many different interpretations over the years, but in general the overall process can be characterised by two main aspects: a) the planning process itself is either centralised (performed by a single agent) or distributed (performed by multiple agents), and b) the solution is for a single agent or for multiple agents. Planning done by a single agent (centralised planner) will encounter the search state-space explosion problem as the number of agents to plan for increases. Distributing/decentralising the planning process can lead to faster computation times depending on the number of agents to plan for and how tightly coupled the actions of these agents are. Tightly coupled actions require more coordination before/after planning, but in loosely coupled scenarios using multiple agents to perform planning can result in substantial improvements in planning time [41]. Another advantage of MAP is that it can utilise privacy-preserving algorithms [110] to maintain various levels of privacy during planning so that private planning information (such as actions, goals, etc.) is not shared amongst agents.
Cardoso and Bordini developed the Decentralised Online Multi-Agent Planning (DOMAP) framework [41] that combines HTN planning with the JaCaMo MAS development platform [32], resulting in BDI agents that can plan at runtime using HTN planning, and then coordinate their actions during execution using JaCaMo’s organisational dimension. The most notable contributions of the framework are that it can be used to bridge the gap between planning and execution often found in the literature, that it provides fair goal allocation prior to planning using a contract net protocol mechanism, and that it performs decentralised HTN planning using an off-the-shelf HTN planner. Results show that DOMAP outperforms the best planners from the 2015 International Competition of Distributed and Multi-Agent Planners (CoDMAP) [96] in terms of planning time, execution time, and parallelism (variance of the plan size of each individual agent, used to indicate the spread of actions and to identify how well execution loads are balanced) for the most difficult problems (large number of agents). As shown by DOMAP, MAS dimensions (agent, environment, and organisation) and agent programming can help to bridge the gap between planning and execution, and future work in this topic can tackle some of the most difficult problems in MAP such as dealing with conflicting actions, coordinating concurrent and joint actions, and enforcing privacy constraints.
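The general shape of a contract-net style allocation can be sketched as follows: a manager announces each goal, every agent replies with a bid, and the goal is awarded to the lowest bidder. The sketch is not DOMAP's actual mechanism. In particular, the way fairness is approximated here, by adding a penalty to an agent's bid for each goal it already holds, is an assumption of this example, as are the function name, the `load_penalty` parameter and the cost values.

```python
from typing import Dict, List


def allocate(goals: List[str],
             base_cost: Dict[str, Dict[str, float]],
             load_penalty: float = 1.0) -> Dict[str, List[str]]:
    """Announce goals one at a time and award each to the agent with the lowest bid."""
    allocation: Dict[str, List[str]] = {agent: [] for agent in base_cost}
    for goal in goals:
        # Each bid is the agent's estimated cost plus a penalty for goals already won.
        bids = {
            agent: costs[goal] + load_penalty * len(allocation[agent])
            for agent, costs in base_cost.items()
        }
        winner = min(sorted(bids), key=bids.get)  # ties broken alphabetically
        allocation[winner].append(goal)
    return allocation


# Illustrative estimates: r1 is cheaper than r2 for every goal.
costs = {
    "r1": {"g1": 1.0, "g2": 1.0, "g3": 1.0},
    "r2": {"g1": 1.5, "g2": 1.5, "g3": 1.5},
}
balanced = allocate(["g1", "g2", "g3"], costs)
greedy = allocate(["g1", "g2", "g3"], costs, load_penalty=0.0)
print(balanced, greedy)
```

Without the penalty the cheapest agent wins every goal, which leaves the other agent idle and reduces parallelism. With the penalty the goals are spread across agents before planning starts.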
Goal and plan recognition
Goal Recognition is a task related to automated planning (Section 2.2), where an agent employs abductive reasoning to infer the most likely goal pursued by another agent [113]. The evidence for this kind of reasoning often consists of a sequence of observations of the observed agent’s plan. Here, instead of deducing a plan from an initial state towards a goal using some kind of domain theory, the deduction is about filtering the correct goal out of a set of goal hypotheses. Research on goal recognition lies within the context of Plan, Activity, and Intent Recognition [144] and employs distinct inference techniques to recognise the ultimate goals of agents under observation. Goal recognition is also related to the problem of Plan Recognition [117], which consists of trying to infer the actual plan adopted by the observed agent. The task of goal recognition has a number of potential and actual applications, including assisting the handicapped [82], activities of daily living [86], and workplace safety [92], among others [143,152].
In the context of the
On the theoretical side, Meneguzzi, Oren and Pereira developed the current generation of state-of-the-art goal and plan recognition algorithms. This started with the application of planning landmarks to perform accurate and efficient goal recognition [88,129–131]. Building upon this work, the group developed approaches for online goal recognition in continuous domains [149], as well as in incomplete [133] and learned domains [134]. The latest iteration of novel goal recognition techniques employs linear programming and operator-counting heuristics to perform goal and plan recognition under noisy and low observability conditions [56]. More recently, Meneguzzi and others developed a series of techniques for hybrid, neuro-symbolic reasoning towards goal recognition, both to automatically derive symbolic representations from unstructured data [25–27,84–86] and to enhance the recognition process with learned preferences of the agents under observation [29]. Finally, Meneguzzi and others have bridged the gap between reinforcement learning and goal recognition techniques by providing the first formalisation and efficient algorithms for goal recognition as reinforcement learning [28].
On the practical side, Meneguzzi and others have developed a number of applications of plan and activity recognition, including activities of daily living [120] and scene recognition [87]. Goal recognition is an area that is at least as broad as the applications of automated planning itself [81]. These areas include network security [89] and crowd safety [95], among others.
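The intuition behind landmark-based recognition can be conveyed with a simplified sketch: each goal hypothesis has a set of landmarks (things that must happen in every plan for that goal), and hypotheses are ranked by the fraction of their landmarks already seen in the observations. The sketch assumes the landmarks are action names given in advance rather than extracted from a planning domain, and the `rank_goals` function and the kitchen example are hypothetical. It is not the algorithm of [129–131].

```python
from typing import Dict, List, Set, Tuple


def rank_goals(landmarks: Dict[str, Set[str]],
               observations: List[str]) -> List[Tuple[str, float]]:
    """Rank goal hypotheses by the ratio of their landmarks that have been observed."""
    observed = set(observations)
    scores = {
        goal: len(goal_landmarks & observed) / len(goal_landmarks)
        for goal, goal_landmarks in landmarks.items()
        if goal_landmarks  # skip hypotheses with no known landmarks
    }
    # Highest ratio first; ties are ordered by goal name for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative, hand-written landmarks for two candidate goals.
candidate_landmarks = {
    "make_tea": {"boil_water", "get_cup", "add_teabag"},
    "make_toast": {"get_bread", "use_toaster"},
}
ranking = rank_goals(candidate_landmarks, ["boil_water", "get_cup"])
print(ranking)
```

Because the ranking needs only set intersections once the landmarks are known, no planner has to be invoked during recognition in this simplified setting.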
If the goals of other agents can be recognised (or are known), then agents are able to coordinate more effectively. The multi-agent intention progression problem for BDI agents extends the IPP presented in Section 2.3. In the multi-agent setting, how an agent progresses its intentions has implications for both the achievement of its own goals and the achievement of the goals of other agents, e.g., if the agent selects a plan that consumes a resource necessary for another agent to achieve its goal. In [54], the MCTS-based approach to intention progression discussed in Section 2.3 is extended to support intention aware scheduling, where each agent attempts to anticipate the possible actions of other agents in the environment when progressing their own intentions. Intention-aware scheduling was shown to be more effective than the approaches in [156] in cooperative, selfish and competitive environments. More recently, this approach to intention-aware scheduling has been extended to the case in which the plans used by other agents are unknown, and agents use an abstraction of their own program called a partially-ordered goal-plan tree (pGPT) to schedule their intentions and predict the actions of other agents [55].
Multi-agent programming
Agent programming languages focus on rational agents, which are computational programs that usually make use of a reasoning cycle in order to make their decisions. The most common reasoning model in existing agent programming languages follows the BDI model, as shown in recent literature surveys [34,43,63,100].
The Multi-Agent Programming Contest (MAPC)
Members of
Recent developments in AI, and particularly in machine learning, have the potential to significantly impact the development and runtime behaviour of agent systems. However, while there have been remarkable advances in machine learning, many tasks remain resistant to learning-based approaches, for reasons of explainability, the need for commonsense or causal reasoning, among many others. We believe that future agent systems must exploit the advantages of both symbolic and sub-symbolic techniques to ensure reliability and resilience in challenging domains, such as long-term autonomy, human-robot teamwork, and hybrid intelligence. For example, such systems must be capable of robust autonomous decision-making taking into account socio-technical concepts (including accountability, responsibility and trust). To be trusted by users and approved by regulators, we must be able to offer guarantees that the behaviour of such systems is, at the very least, safe. This will require the development of new approaches to the verification and synthesis of MAS that combine symbolic and sub-symbolic approaches. Verification of such hybrid systems is extremely challenging, and will require significant advances across a range of areas, including logics of strategic ability, automata-based learning, stochastic search and quantitative games on graphs, among others.
We believe that argumentation and dialogue can also address several challenges related to agent reasoning, coordination and explanation. One strand of active research involves investigating how closely complex argumentation systems (e.g., those that incorporate weights or structured arguments) mimic human reasoning [136,150], and how such systems can be made to comply with desirable properties. Such work can allow an agent to infer a human’s preferences [109], or identify how strong information needs to be to convince a user of some conclusion [127]. At a more applied level, we intend to investigate argument-based reasoning in the context of socio-technical concepts such as trust, responsibility and accountability [52].
Despite recent trends in deep learning [148], key applications of AI in our daily lives will involve collaboration between AI systems and humans [36]. Such collaboration imposes several challenges to AI systems, one of which involves coordinating the activities of AI-driven algorithms and their human partners. To accomplish such coordination, AI systems must be capable of reasoning about a number of aspects of cognition and behaviour. This includes inferring the intentions of their human interlocutors (Section 3.3), reaching consensus among mixed teams of agents and humans (Section 3.1), and planning for joint courses of actions (Section 3.2). Our group envisions addressing these kinds of challenges using a mixture of classic AI models (for robustness and explainability), as well as machine learning models to deal with inherently noisy environmental data. Recent work in this direction includes eliciting (human) preferences from the arguments they advance [109], and identifying what strength an argument requires to be acceptable to a human reasoner [127].
Most of our work in the past has been in symbolic AI (representation of AI problems through the use of symbols, simulating the human reasoning process) with some applications of machine learning [29]. However, there are still many open challenges and potential advantages of combining symbolic AI and machine learning techniques. For example, symbolic AI has been shown to be effective in scenarios where reactivity is necessary [46], while machine learning has demonstrated excellent results in scenarios where learning is essential [119]. Therefore, scenarios that require both reactive and learning behaviours, such as self-driving cars or autonomous vehicles, would greatly benefit from such integration. Moreover, with recent advances in neuro-symbolic AI [64], we can now align ontologies and concepts with neural networks. A new research avenue will be to integrate argumentation with such technologies to boost the reliance of neural network systems by verifying that they comply with basic rules and possibly discover hidden patterns in data. This will also allow
Alongside the purely architectural questions of how symbolic and sub-symbolic approaches can be effectively combined, work in this direction gives rise to a range of new research problems centred around the notion of bounded adaptation [34]. How should the split between predefined or canonical behaviours and learned behaviours (e.g., refinements, or implementations of very high-level actions, etc.) be characterised? What development methodologies and verification approaches can be used to specify and certify the behaviour of agents that integrate significant AI capabilities into their decision making? This can be seen as establishing a new strand of research exploring hybrids of programming-based, learning-based, and model-based approaches to developing AI capabilities [80]. We believe that the agent programming paradigm forms an ideal framework in which to explore the resulting complex mix of scientific and engineering questions.
Since several members of
Conclusion
We have discussed recent work in
There are still many open challenges that are directly related to resilience, reliability, and coordination in MAS that we will continue to investigate by extending some of the work reported here. Nevertheless, we also plan to use our past experience in these research themes to combine and apply them with socio-technical concepts, human-agent scenarios, and hybrid symbolic and statistical agents.
Acknowledgements
The research reported in this paper was funded and supported by various grants over the years: Robotics and AI in Nuclear (RAIN) Hub (EP/R026084/1); Future AI and Robotics for Space (FAIR-SPACE) Hub (EP/R026092/1); Offshore Robotics for Certification of Assets (ORCA) Hub (EP/R026173/1); the Royal Academy of Engineering under the Chair in Emerging Technologies scheme; Trustworthy Autonomous Systems “Verifiability Node” (EP/V026801); Scrutable Autonomous Systems (EP/J012084/1); Supporting Security Policy with Effective Digital Intervention (EP/P011829/1); The International Technology Alliance in Network and Information Sciences.
