Abstract
The self-adaptive multi-agent system requires adaptive adjustments based on the dynamic environment during its runtime. Heterogeneous agent can accomplish different task goals, enhance the efficiency of system operation, but its complex collaboration problem poses new challenges to the study of verification of adaptive policies for heterogeneous multi-agents. This paper proposes a runtime verification method for self-adaptive multi-agent systems using probabilistic timed automata. The method constructs a probabilistic timed automaton model by formally describing the functional characteristics of heterogeneous agents and integrating random factors in the environment to simulate the operation process of the self-adaptive multi-agent system. Regarding the collaboration logic among heterogeneous agents, security constraints are established to ensure the security of state transition processes during system operation. Combining model checking with runtime quantitative verification methods to conduct experiment and applying it in the case of an intelligent unmanned parking system. Experimental results manifest the correctness of the cooperation logic between agents can effectively ensure the stability of the system at runtime. Significant improvement in system uptime and efficiency compared to the initial system without runtime quantitative validation.
Keywords
Introduction
Software systems characterized by functionality that is fixed at design-time are being replaced by systems with dynamic change and adaptive capabilities [1, 2]. More advanced systems are no longer simply monitoring their own behavior, but are more likely to predict and discovering deviations from desired behavior. Through a closed-loop control loop [3], with limited managerial involvement, an self-adaptive system (SAS) can change its behavior and internal structure in response to changing environmental conditions, changing demands, and internal changes to continuous satisfaction of Quality of Service (QoS) [4]. Adaptive capability refers to the ability of the agent to make system decisions independently based on the collected environmental data. The level of adaptive ability can be regarded as whether the agent can make correct decisions according to the change of environmental conditions. Whether the decision is correct or not will have a significant impact on the operation process of SAS. For example, the decision of SAS makes the system state transition wrong, which will lead to major consequences such as deadlock in system operation. Therefore, self-adaptive verification in complex environments has become one of the urgent problems to be solved in the development of reliable SAS [5]. Self-adaptive verification is moving from design-time to run-time and single functions to complex functions
Initially, system testing was used to improve the reliability of SAS decisions [6, 7]. This method determines whether the function of SAS is qualified through various test devices and interfaces. However, the system testing methods can neither predict and emulate the various system behaviors that may occur when the SAS runs in dynamic environments, and nor can they reflect the running situation of the system in dynamic environments. D. Weyns et al [8], has conducted research on the formal description methods of SAS, and pointed out that the formal description method can provide QoS for SAS by specifying and verifying various behaviors that SAS may produce [9–12]. Model checking technology was the concrete application of the formalization description method. This technology formalizes SAS as a verifiable system model, and verifies whether its attributes satisfied the given formula specification [13]. However, the number of states in the system model will affect the verification process and can only be applied to small systems with finite states. In order to make the model checking technology can be applied to more complex large SAS. Based on model checking, mathematical analysis and logical formalism are introduced to assist with the verification of large-scale or infinite-state system [14, 15]. The method is to reduce the impact of path repetition and the number of states on verification efficiency by symbolic expression equivalence verification. Dobson et al has studied the verification of SAS security, accessibility and stability, then provided more perspectives for the verification of SAS [16–18].
However, the above methods are usually applied in the off-line phase such as system design or maintenance, and unable to deal with the random uncertainties that appear during the runtime. Self-adaptive systems face random uncertainty in their environment during operation. For example, dynamic obstacles and weather conditions in the environment. We need to consider the effects of uncertainties to ensure the reliability of the validation results. If significant changes such as requirements and hardware device failures occur during the operation of SAS, the QoS of SAS may decrease, and even cause the system function unusable situation in severe cases. Therefore, the verification of SAS should adapt to the dynamic changes of environment and requirements. and ensure the security and stability when SAS is running. In this paper, security refers to the state changes staying within the set rules, stability refers to the state changes keeping the same trace.
The verification of SAS driven by formal methods at runtime can be a feasible solution to the above problems [19]. Runtime Quantitative Verification (RQV) is a mathematical method based on formal drive SAS reconfiguraation. The RQV technique involves continuous monitoring of the system and its environment to establish current environmental parameters and system state. Relevant changes are diagnosed and quantified, and selected models are analyzed using quantitative validation to identify (and in some cases predict) violations of QoS requirements such as response time, availability, and cost. We construct stochastic models of SAS and its environment, monitor the models using RQV, and validate the system control loops through formal analysis of the models. R. Calinescu et al introduced the concept of RQV into the field of SAS verification [20, 21], and applied it in the cases of dynamic reconfiguration of telemedicine service system [22] and dynamic resource management of cloud computing infrastructure [23] respectively.
As the application scenarios of SAS become more and more complex, more and more functional goals are achieved, SAS needs to develop in the direction of cooperation and distribution to have better concurrency and fault tolerance [24]. Therefore, Multi-Agent System (MAS) and its cooperation mode and distributed optimization technology have attracted the attention of researchers. A multi-agent system is a collection of multiple agents that builds a large and complex system into one that can collaborate and communicate with each other and is easy to manage. Multiple individuals working together to accomplish a task have the advantages of low cost, strong survivability, high execution efficiency, and high flexibility compared to a single individual performing a task. Due to the different features and functions of each agent [25], its state space dimension and corresponding dynamic information, multi-agent systems can be categorized into homogeneous multi-agent systems [26, 27] and heterogeneous multi-agent systems [28]. However, the SAS described by the above researches using RQV are all based on homogeneous MAS, not considered the existence of heterogeneous Multi-Agent Self-Adaptive Systems (MA-SAS), and lack of formal description of the cooperation logic between heterogeneous agents. In order to verify the heterogeneous MA-SAS, it is indispensable to solve the problem of formal modeling and cooperation logic description of heterogeneous agent.
In general, this paper constructed a probabilistic model for a multi-agent self-adaptive system in a random environment, focused on the problem of cooperation between heterogeneous agents and proposes a formal solution, combined model checking and RQV techniques, deployed at runtime to execute validation models to ensure system security and stability. This paper starts with Probabilistic Timed Automata (PTA) and heterogeneous MA-SAS in dynamic environment. Combining RQV proposes a heterogeneous multi-agent self-adaptive runtime verification method. Firstly, it constructed the PTA model according to the heterogeneous agents in the system, then combined the PTA model to describe the SAS as a verifiable Probabilistic Timed Automata Network (PTAN) model and loaded it into the model checker. Secondly, according to the coordination logic of heterogeneous agent in SAS to formulate security constraints. Formalizing the security constraints and incorporating them into the PTAN model. The PTAN model was executed under the security constraints to ensure security of the state transition process during the system operation. Finally, the functional behavior of SAS was formalized as a timed computation tree logic, which was used to quantify the functional requirements of the system as the verification target. Through the model checking and RQV analysis PTAN model.
The main contributions of this paper are as follows: This paper contributes the Heterogeneous Multi-Agent Runtime Verification method (HMARV). In this method, the heterogeneous agents are formalized as PTA models and combined into a PTAN model to simulate MA-SAS. This paper proposes a formal description method of cooperation logic between heterogeneous agents to solve the problem of lacking formal description of cooperation logic of heterogeneous agents in MA-SAS. This method transforms the cooperation logic into a model language and adds it to the system model, the agent cooperation logic can also be verified by model checking. This ensures the stability and security of state transfer during system operation. RQV technology is extended to heterogeneous MAS, and the functional behavior of the system is quantified as the verification index. The experiment compares the accuracy and efficiency of the system without RQV and the system with RQV in the running process, and the results manifest that RQV can effectively improve the reliability of system runtime. The method in this paper has been successfully applied in a Intelligent Unmanned Parking System. By comparing the HMARV system with the original system in terms of the percentage of running time in normal status and the number of tasks completed, this paper successfully demonstrates the effectiveness of the HMARV method in improving the stability of MA-SAS in operation.
Comparison of related work
A. R. Lomuscio et al constructed agents with random behaviors as Markov Decision Processes (MDPs), and combined Alternating Time Logic (ATL) to establish agent checking metrics. However, the verified object entities are different from those in this paper, and the case of collaboration problems between different types of agents is not considered [29].
R. Gadelha et al represented scenarios in text or graph to describe a set of event sequences of SAS, used the transition probabilities between scenarios to express the behavior specification of SAS and used model checking techniques to verify whether the reachability property holds. It is consistent with the logic of this paper, but its SAS requirements for verification have certain limitations, which cannot express all behavioral events of SAS and requires a large amount of scenario data to support the correctness of verification results [30].
M. Casimiro et al applied the formal probabilistic framework based on model checking to the system framework based on machine learning, modeled uncertain behaviors through MDPs and generated optimal policies based on reward attributes [31]. G. Mason et al uses abstract strategies to define regions of agent behavior space in the system, and combines RQV to identify behavior strategies that satisfy constraints [32]. However, its model framework is based on a single agent, which does not involve heterogeneous agents and collaboration problems, the constructed probabilistic model is mainly for the retraining module in machine learning, which can express limited functional behaviors.
D. Weyns et al constructed a verification model based on self-adaptive feedback control loop MAPE-K [33], and used a set of correctness attribute verification models to represent the four main stages of MAPE-K [34]. The system model constructed by SAS in a specific scenario includes the system functional behavior and the four main phases of MAPE-K, which can be directly deployed and executed at runtime. But compared in this paper, the verified model does not involve the concept of multi-agent, and is not suitable for multiple feedback loop system. And their experimental models express different objects than ours, they express specific decision-making behavior, we express the functional behavior of the agent.
M. Carwehl et al present a runtime verification approach for self-adaptive systems with changing requirements, it uses property specification patterns to automatically obtain automata with precise semantics to reflect requirement changes and enable continuous verification [35]. Differences also exist in the experimental process and objectives, where they tested the impact of the efficiency of the adaptation model generation on SAS. We examined the improvement of SAS by multi-agent collaborative modeling and RQV.
T. Vogel et al present a framework for formal modeling and analysing self-adaptive systems, contribute a formal method, that exploits the concept of multi-agent Abstract State Machines to specify distributed and decentralized adaptation control in terms of MAPE-K control loops [36]. And regards properties related to the application requirements, such as invariants, general properties on the controlled part of the system, adaptation goals expressed as reachable and liveness conditions, that can be verified. Our formal method focuses on collaboration between agents and verifies the functional behavior of the agents as an adaptation target, the aim is to ensure the security of agent collaboration.
Modeling foundations and property specification
Clock and clock constraints
Timed automata are finite automata with clock sets [37], let X be a finite set of clocks. x ∈ R ∩ [0, + ∞], denoting the clock variable that records time. An action to assign a value to a clock variable is defined as a mapping of the clock variable to the set of nonnegative real numbers, denoted X → R, R ≥ 0. Let C be a clock constraint composed of Boolean variables such that the result of a clock constraint C is True if the value of a clock variable x satisfies its clock constraint C, denoted as C = = True, which can be defined in these forms: x ≤ n; n ≤ x; x1 + n1 ≤ x2 + n2; ¬C; C ∩ C, where: x1, x2, …, x i ∈ X. n1, n2 are natural numbers. Let C denote the set of clock constraints on the set X of clock variables if all x present in C such that C = = True.
The transition from state s to s′ is denoted as (s, C, a, U, s′) ∈ E. This formula means that when in state s, the timed automaton TA can execute action a as long as the clock constraint C is satisfied, and the clock variable in U will be reassigned so that the state is transferred from s to s′ without violating the clock constraint of I (s′). TA can also stay in state s if the clock constraint I (s) is continuously satisfied.
Probabilistic timed automata model
Model checking is one of the main methods to solve the problem of self-adaptive verification. Its basic idea is to establish a mathematical model of the system behavior, and then perform formal analysis based on the model [38]. Timed automata is a theory for modeling and verification of real systems [39, 40], and the system behavior and structural characteristics of SAS can be modeled as timed automata models for verification and analysis. However, SAS needs to adapt to the dynamic changes of the environment, and its behavior is affected by uncertain factors with uncertain characteristics. Therefore, the validation model of SAS also needs to have the ability to analyze the uncertainty factors. Uncertainty can be classified into various types according to its nature and source, such as internal or external [41], parametric or nonparametric [42], and constant, random [43], characteristic [44], and it also involves a variety of domains.
Probabilistic timed automata are a modelling formalism for systems that exhibit probabilistic, nondeterministic and real-time characteristics. [45], which can be used to describe and analyze the behavior of real-time systems with random factors. Probabilistic timed automata model real-time behaviour in the same fashion as classical timed automata, using clocks. Its definition is extended on the basis of timed automata. The random uncertainty in the internal and external environment is an unavoidable factor of SAS. In order to verify SAS security more effectively, it is intensely indispensable to incorporate this randomness into the verification work. Assuming that some specific aspects of SAS behavior can be described in the form of probability, this kind of behavior can be modeled as a stochastic process. For example, the random uncertainty from the system environment can be abstracted in the form of probability, and a probabilistic timed automaton model can be constructed for verification analysis. If an agent status in the system has many transfer paths, the various paths can be abstracted into system state change routes that are influenced by various random factors, makes the agent transfer state according to probability.
The transition of a probabilistic state can be denoted as (s, C, a, P) ∈ T, Denote that when in state s, the probabilistic timed automaton PTA can execute action a as long as the clock constraint C is satisfied, and the clock variables in U will be reassigned such that the state is transferred from s to s′ with probability P (U, s′), while the clock constraint of I (s′) is not violated. Similar to the TA model, the PTA can also stay in state s if the clock constraint I (s) is continuously satisfied.
SAS with heterogeneous multi-agent can adapt to more complex application scenarios, but how to build a heterogeneous multi-agent SAS model has become the difficulty of its verification. According to Definition 1 and Definition 2, the heterogeneous agent in SAS are respectively established as TA models and the random uncertainty factors are abstracted as probability forms. The TA model is extended to the PTA model. From Definition 3, multiple PTA models can be combined into a single PTAN model. When a complex SAS is composed of many homogeneous or heterogeneous agent, the PTAN model composed of multi-agent PTA models is the verifiable system model of SAS.
Property specification for probabilistic timed automata
After the system modeling of SAS is obtained, the system model needs to be analyzed and verified. Timed Computation Tree Logic (TCTL) is a set of temporal property specifications, which can be used to formally express system properties that need to be verified. Its formula is described as Table 1.
TCTL formula description form
TCTL formula description form
The letters A and E are used to quantify paths, according to the definition of the formulas in Table 1. A is used to indicate that a given property should hold for all paths of the tree, while E indicates that there should be at least one path of the tree containing the property. The symbols [] and < >are used to quantify states in a path. [] indicates that all states on the path should satisfy the property, while < >indicates that at least one state in the execution satisfies the property. E< >and A[] are A set of dual properties, that is, E< >is satisfied if and only if A[] is not. This type of property is often classified as a security property, meaning that the system is safe in the sense that a particular hazard will not occur.
The model property checking of SAS constructed by TCTL can make a preliminary security verification of system structural properties.The state formula and path formula syntax of TCTL are shown in Equations (1) and (2).
Where: φ is the status; ψ is the path; a ∈ AP is an atomic proposition and AP is a set of atomic propositions; χ ∈ C (X) is a clock constraint; Δ ∈ {< , ≤ , ≥ , >} is a relational operator. p ∈ Q ∩ [0, 1], q ∈ Q≥0, r is a reward structure, k ∈ N.P Δ p [ψ] means that the probability of path formula being true satisfies probability bound P, and R r Δ q [ρ] means that the expected value of reward function ρ on reward structure r satisfies its bound.
Formulas in logic are often state formula, i.e., formed by φ in the syntax described above. These are evaluated on the basis of the state of the P. For the state s and the formula φ, denote by s| = φ that φ is satisfied in s. The syntax also includes path formula ψ and reward operator ρ, which appear only in subformula of the P and R operator. The path formula consists of two types: time-bounded until (φ1 ∪ ≤kφ2) and unbounded until (φ1 ∪ φ2). The φ1 ∪ φ2 denotes the eventual attainment of a state that satisfies φ2 and satisfies φ1 at every moment prior to that. The φ1 ∪ ≤kφ2 have the same meaning, but with the additional constraint that the occurrence of φ2 must occur at time k.
HMARV framework
This paper proposes a runtime verification framework for MA-SAS based on the probabilistic timed automata model, as shown in Fig. 1. The framework is based on the MAPE-K feedback control loop in SAS, and the multi-agent model in SAS is modeled and loaded into the model checker to execute, which is verified by RQV and model checking. The following paragraphs briefly describe each module of the HMARV framework.

Framework of HMARV.
In heterogeneous agent cooperation, the differences between agents make heterogeneous agents may have different behaviors and execution strategies. It is more difficult to coordinate and harmonize strategies so that agents can cooperate on the same goal. That is, it is necessary to develop a clear way for heterogeneous agents to cooperate and unify the task goals. To address this difficulty, we design agent cooperation logic, generalize how agents cooperate and propose a formal solution.
The formalized approach is a preparation for applying model checking. Model checking is a formal verification technique that verifies whether the design or specification of a system satisfies a given property or rule, so to verify a MA-SAS with cooperation problems, it is necessary to formally describe how the agents cooperate. We construct agent cooperation logic through a probabilistic timed automata modeling language, the model checker used is UPPAAL [46]. In UPPAAL, a model checker adapted to this model language, multiple models can be used for simultaneous jumps and inter-model inter-collaboration through Guard, Synchronization and Update functions. Therefore, heterogeneous agents with different functions can be modeled separately in the same environment. The temporal constraints between agents can be designed to achieve the interactions through Guard; The collaborative state changes through Synchronisation; The data updates of agents through Update.
Defining and expressing system properties or specifications is a critical step when applying model checking. Converting properties described in natural language into formalized specifications is a difficult task that can lead to ambiguous or inaccurate specifications, which in turn affects the accuracy of the checking results. We express system properties logically through a temporal computation tree, formally describe the system’s task goals, and derive checking results in a model checker.
There are heterogeneous agent with various functional types in complex SAS, and they need to cooperate and communicate with each other to accomplish task goals.
Agent cooperation logic
In complex SAS, different types of agents are needed to accomplish different task goals, and a task goal may require the joint participation of multiple agents. Suppose a task objective necessitates cooperative engagement between Agents A and B to accomplish successively. A needs to communicate with B while executing its own task to ensure that B can carry out the task after the interaction. Checking the status of the agent at task handover keeps the system in a safe and reliable state at all times. Task goals determine how often agents collaborate with each other and what conditions are met when they collaborate. Agents are in states such as waiting, interacting, or checking during the collaborative execution of a task, referred to as cooperation logic in this study. The order in which agent cooperates to perform a task, the state it is in, and when it begins to perform the task are referred to as cooperative modes in this study.
The cooperation logic of an agent represents its state transition logic. According to the requirements of system task goals, an agent may require more than one cooperation logic and take part in the accomplishment of more than one task goal. Homogeneous agents have the same cooperation logic, heterogeneous agents have different cooperation logics due to differences in functional structures. In this paper, the interactive behavior between agents is called the interactive event, which is initiated by a single agent and may be received by multiple homogeneous or heterogeneous agents. The interaction event can include the update of environment variables, the adjustment of state attribute variables and the check of transition conditions et. Section 4.2.2 and 4.2.3 will describe the interaction events between agents and the cooperation mode of heterogeneous multi-agents in detail.
Formal modeling of agent interaction events
The event is the interaction and cooperation between multiple agents. In the probabilistic timed automaton model, the event between agents can be implemented using broadcast channels. According to the interaction and cooperation modes between agents in application scenarios, interaction events can be divided into update events and constraint events. Suppose Eu be the set of update events and Ec be the set of constraint events. The initiator of Eu is generally a single agent, and the receiver is a single or multiple agent, which updates the relevant variables of its own state according to broadcast messages. The status of the agent can remain unchanged when the broadcast information of the updated event is sent or received, or the status of the agent can be transferred after the broadcast information is sent or received. Generally, Ec consists of one constraint initiator and one bound party. The initiator of the constraint makes the state of the constrained party not transfer until certain constraint conditions are met, or the state transfer must be carried out after the constraint conditions of the constrained party are met. This is usually accompanied by the Eu event of the bound party.
Suppose s i , s j ∈ S, u ∈ Eu, u ! And u ? represents, respectively, the sending and receiving of update events. The update event area on the left in Fig. 2 displays update events that an agent sends or receives. In state s i , agent sends an update event. After the message is sent, the agent may remain in the state s i or perform the state transition to s j . The same is true for receiving update events. Since message passing is instantaneous, the state s i of agent for event initiation or end needs to set the state attribute of automaton to Committed, which means that no system time is consumed, which also indicates that message passing is an instantaneous process.

Update event and constraint event modeling.
In the process of constraint events, usually accompanied by the sending and receiving of update events, agent state attribute variables also change. Assuming that the constraint between agents is R, Ec ∈ Eu × C, Suppose s i , s j ∈ S, c ! and c ? represents the sending and receiving of constraint events. As shown in the constraint event section on the right in Fig. 2, the agent issues the constraint event c ! at state s j . Receives constraint event c ? in state s m and determines whether its own state meets the constraint condition R. The conditional ternary operators represent the judgment of constraints, and τ and ¬τ represent the result of the judgment. The judgment result of ¬τ indicates that the state stays in s m until the result of the constraint judgment is τ. The state of the automaton that sends the constraint event needs to be set to Committed, indicating that the system time is not consumed. The state of the received message does not need to set the state attribute of the automaton because there is a certain time requirement for the achievement of the constraint conditions.
A SAS in complex scenarios consists of multiple agents. According to the different functional requirements of the system, agents can be divided into multiple structures. agents that achieve the same functional goals have the same cooperation logic. In this paper, the cooperation logic of the agent is expressed by a probabilistic timed automaton model, one agent corresponds to one PTA model.
Suppose AG is a set of all agents in a SAS system, each ag k ∈ Ag, ag k represents an agent with cooperative logic of k, k ∈ N. Suppose the probabilistic time automaton tuple corresponding to ag k be PTA k = (S, s0, T, I, X, V), where the state, initial state, directed edge and state transition probability of time automaton correspond to the state, initial state, state transition and state transition probability of agent cooperation logic respectively. The assignment of cooperation logic expression and time variable sum in agent cooperative logic corresponds to X and V in time automata. The g in cooperative logic is translated into state invariant, trigger condition and action in state transition according to semantics, which corresponds to I in time automaton; The assignment in each state transition of the time automaton corresponds to the assignment of the γ function in the cooperative logic.
For a system composed of a single agent or multiple isomorphic agents, the system has a unified task goal, and the corresponding agent has the same cooperative logic. In this case, the state transfer of the agent cooperative logic is carried out time in sequence, free from complex random causes and interactions with other agents. For the system composed of heterogeneous agents, it can be determined that the system contains at least two or more functional goals, which need to be completed by agents of different structures. The PTA model constructed by the PTA model can form a probabilistic timed automata network. The ways for multiple agents to jointly achieve task goals can be divided into two categories. One is that agents complete task goals successively. As shown in Fig. 3, task goals Task1 and Task2 are completed sequentially by agent1 and agent2 respectively. The agent1 status starts with s i and sends interaction events to the state s j of agent2 after completing Task1. agent2 completes Task2 after state s j receives the interaction event. The other one is initiated by a single agent, and multiple agents cooperate to complete the task goal Task3. As shown in Fig. 4, agent1 and agent2 start execut Task3 in the initial state s i and s m respectively. Agent1 is the initiator and agent2 is the collaborator, both of which require each other’s participation to accomplish the task objectives.

Sequential completion of task objectives.

Collaborate to achieve task objectives.
The main functional behavior of MA-SAS is accomplished by different agents in the system. Multiple agents with different characteristics and functions cooperate and adapt to achieve a variety of tasks and goals, which constitutes a system, that is MA-SAS. Therefore, the PTA model is constructed for each agent in the system. By combining these PTA models according to the task goals, a PTAN model representing the functional behavior of the whole system can be obtained. Load the PTAN model in the model checker. Combined with RQV to quantify the task objectives of the system, the running process of the system can be simulated and the verification results of the task objectives can be obtained at runtime.
Through the agent modeling method mentioned above, the complex heterogeneous multi-agent self-adaptive system can be modeled formally. By constructing the cooperation logic between agents, the agent cooperation mode in the system is preliminarily determined. Specifically, the sequence in which agents execute tasks, the state they are in, when they start executing tasks, and whether other agents are required to achieve certain preconditions, etc. Determining the collaboration mode is to prepare for the subsequent formal description of the cooperation logic. In Section 4.2.2, interactive event modeling summarizes the interaction modes between agents into two categories. In Section 4.2.3, heterogeneous multi-agent modeling summarizes the ways in which different types of agents participate in executing tasks into two categories. And both give the description expression of PTA model language. It can be integrated into the PTAN model of SAS, and there is no model language conflict. It solves the cooperation problem and formalization problem of heterogeneous multi-agent. By running the PTAN model with the collaboration logic in simulation, it can analyze and verify the security of system cooperation logic and the improvement effect of RQV on system reliability.
Application examples and analytical modeling
In this paper, the Intelligent Unmanned Parking System (IUPS) is used as the application background and experimental scene. There are two kinds of agent in the system. one is the Lift robot for loading and transporting vehicles across floors, the other is the automatic parking robot AGV. The parking task is to transport the vehicle across floors to the designated floor by Lift, and then to the target parking space by AGV. When the vehicle needs to be picked up, the AGV transports the vehicle to the Lift, and the Lift transports the vehicle to the ground. Lift and AGV cooperate to complete the process of parking, which involves the cooperation of heterogeneous agents, and consider the impact of environmental factors, the uncertain behavior of agent in this environment is constructed as a probabilistic timed automaton model for verification and analysis.
Probabilistic state model of IUPS
This paper extracts the key states for Lift and AGV from the IUPS. The key states of Lift are shown in Table 2 and the key states of AGV are shown in Table 3.
Key states of Lift
Key states of Lift
Key states of AGV
The PTA model can be obtained by combining the key states of agents with the transition probabilities between faults. Lift is taken as an example here to illustrate that the PTA model of Lift is a six-tuple Lift = (S, s0, T, I, X, V). Nonempty state set S. The state in the model is derived from the system operating state, S contains two types of states S = {N i , F i }, where the state N i corresponds to the normal operating state of the system one by one, and the state F i represents the fault in the operation of the system, which is used to express the random uncertainty. s0 indicates the initial system status. Probabilistic state change set T. Describes the constraint conditions and transfer probability that need to be satisfied for Lift to transfer from state s to state s′. The probability set P contains two types of transfer probability, p i represents the transfer probability under normal operation of the system, q i represents the transfer probability of system failure, and probability 0 is used to indicate that the transfer path cannot pass. Data variable set V. Used to mark the status node attributes of Lift and AGV, accordingly V is a collection of state node attribute variables in the two types of agents.
Combined with the IUPS used to build the system model, the PTA model including Lift and AGV under uncertain conditions can be obtained, again taking Lift as an example, and its model is shown in Fig. 5.In the model, the state N i corresponds to the normal operating process of the system, and the state F1 corresponds to the uncertain fault state which can be transferred to the normal operating state by probability. For N0, as the initial state of the system, there is a probability of p0 transferring to N1 and a probability of q0 transferring to F1. Since N1 and F1 are output states of N0, q0 = 1 - p0. For F1, it represents the failure of the system. In this state, the system has a probability of p5 transferring to the normal state N1, and there is a probability of q5 remaining in the original state to indicate the repair process, q5 = 1 - p5. The normal state means the key state in which the agent is in during its normal operation, and is the node where the agent performs the probability transfer, as shown in Tables 3. Failure state means the state in which the agent is affected by random uncertainties in the environment and malfunctions, and is the opposite of each normal state. the more times the agent transfers between normal states instead of being in a failure state, the better the performance of the agent.

The state transfer model of lift.
The random uncertainty in the environment is regarded as the cause of agent failure, and the Bayes formula is used to calculate [47]. Take the state of failure during Lift operation as the fault set W = {W1, W2, . . . , W n }, take the fault symptom data classification as the symptom set E = {E1, E2, . . . , E m }, where n and m represent the number of fault sets and symptom sets, respectively. A two-layer network is established, the first layer is the fault set, the second layer is the symptom set, and the elements in the sample sets of the two layers are independent of each other. The nodes of the two layers are matched and connected according to the manifestation of mechanical faults, and the network is constructed as shown in Fig. 6.

The double layer network of fault sets and symptom sets.
The posterior probabilities of various faults are calculated by Bayes formula, as shown in Equation (3). The purpose of this method is to estimate the probability of a certain failure by the case that the Agent shows some symptoms, and this probability is regarded as the transition probability between the normal state and the error state of the agent.
The fault data collected during Lift and AGV operation is taken as the prior probability of symptom set, and the prior probability of fault set is 0.1. The posterior probability of each faulty node can be obtained through calculation by the Bayes formula, as shown in Table 4. In this table, q i represents the probability of the system transferring from N i toFi+1, that is, the probability of failure in normal operation. Therefore, the larger the value of q i , the greater the probability of agent failure. It can be concluded from the data in the table that the mean probability of Lift failure is lower than that of AGV. According to q i , the probability value of transferring from N i to Ni+1 during the normal operation of the system can also be obtained, namely p i = 1 - q i . For the self-cycling state with F i as the starting point, the uniform value is qi+4 = 0.1, accordingly the probability of the F i state transferring to N i state is pi+4 = 0.9.
Posterior probability of failure nodes
According to the description of PTAN in Section 3.2, there are two kinds of agents with different functions, Lift and AGV, and the system model can form a probabilistic timed automata network PTAN. The system model is a four-tuple IUPS = ((PTA) , V G , Cl, Ch), where Probabilistic timed automata set PTA includes Lift and AGV PTA models in the system. V G , Cl and Ch are global variable sets, global clock sets and interactive communication channel sets involved in Lift and AGV respectively.The PTAN model is loaded into the model checker, which translates the PTA model into a graphical representation that can execute. The model checker comes with a template class through which developers can implement connections to model elements.
This paper takes the IUPS as an example. We evaluate our HMARV by investigating the following research questions. Determine the applicability of the method in this area and compare the improvements to the system using the method.
To perform our evaluation, we construct the system model of IUPS. The model contains a set of heterogeneous agents and a single agent. The heterogeneous agents are Lift and AGV, which are used to implement the functional behavior of IUPS. A single agent is Controller, which is only used to assist the experiment and facilitate our debugging of the experimental system. We deployed the system model of IUPS on UPPAAL 4.1.19, which is a model simulation verification tool jointly developed by Aaborg University and Appsala University. It has supported the simulation and simulation of probabilistic timed automata and has been used in many case studies. Version 4.1.19 supports SMC method, which can conduct inference verification on the model of the complex real-time system with random semantics [48].
The modified system with RQV is indicated as Lift-RQV and AGV-RQV, and the original system without RQV is indicated as Lift-original and AGV-original. In UPPAAL, each simulation time corresponds to 1s of reality. Using 500 simulation times as a unit time, the results of the two types of agents running in 10 unit times are obtained, as shown in Fig. 7. The accuracy of both types of agents has been improved after adopting RQV compared to the agent without RQV. In terms of accuracy, Lift had about a 23% improvement and AGV had about a 26% improvement. As AGV encounter more faults, their accuracy rate fluctuates more, which is a more obvious improvement after using RQV. In cases where collaboration is not involved, the using RQV does improve the accuracy of the Agent running in a normal state.
To further reflect the role of RQV. We define the number of the agent task goal is completed over a period of time as the evaluation criterion. The higher the number of completions, the better the system performance. Using 100 simulation times as a unit time, the results of agents running in 10 unit times are derived, as shown in Fig. 8(a) and 8(c).

Verification results of normal time proportion of single agent.

Verification result of task completion times of single agent.
It can be seen that the number of task goals completed by the agent after adopting RQV has increased significantly in the same time period compared to the agent without RQV. The gap between Lift-RQV and Lift-original gradually widens as the runtime grows, and the same trend is observed between AGV-RQV and AGV-original. This means that an agent with RQV applied can accomplish more task goals at the same time, and as time grows, the effect of RQV enhancement will become more and more obvious. Figure 8(b) and 8(d) manifest the difference in the number of task goal completions, which can be seen more visually as the difference increases over time. In cases where collaboration is not involved, the using RQV does improve the efficiency of the agent in completing tasks, and the longer the run time, the more significant the improvement.
We combined the PTA models of Lift and AGV into a PTAN model and loaded it into UPPAAL. The PTAN model can simulate the operation of IUPS. Through the key states described in Tables 3 in Section 5.1, abstracting the cooperation logic of agents in IUPS as security constraints, as shown in Table 5. Applying our proposed agent cooperation logic formalization method, the security constraints in the table are formalized as the interaction event model in Section 4.2.2. The first 3 constraints in the table are modeled as constraint event models, meaning that the agent cannot change its state until the constraint is satisfied. During operation, a constraint event occurs along with a change in the agent state properties. Models changes in agent state properties as update events. Thus constraints 4 and 5 in the table can be modeled as update events.
Security constraints for Lift and AGV cooperation
The formal model of the above constraints is added to the PTAN model of IUPS. To ensure that the model can properly simulate the autonomous operation of IUPS in an unmanned environment, the correctness of the model’s system logic and time sequence logic needs to be verified. The constraints of Table 5 are expressed in the TCTL formula and the results are verified in UPPAAL, as shown in Table 6. The results in the first two rows of the table are the state reachability verification of Lift and AGV. It is proved that all key states of Lift and AGV are in normal reachable state in the model. The properties 1 to 5 in the table correspond to the serial numbers in Table 5. If the result of the verification property is satisfied, it means that the security constraint is correct. The state transfer process of PTAN is restricted to the safe range. We use this to judge the correctness of the security constraint.
Verification of system logic and time sequence logic correctness
After getting the correct agent cooperation logic. We simulate the system trace of Lift and AGV collaborating to complete a parking task to further verify the effectiveness of the cooperation logic. The simulation results are shown in Fig. 9. The Controller is used only for auxiliary experiments, for the start and end of the control system, and does not affect the experimental results. The numbers 0 to 5 in the figure correspond to the serial numbers of Tables 3 in Section 5.1 and represent the states that Lift and AGV are in. As can be seen in Fig. 9, the completion of a parking task, the paths of the two types of agent state transfer in the system will be strictly in accordance with the formulated security constraints. It shows that the cooperation logic of Lift and AGV can keep the operation of IUPS stable. Figure 9 simulates the path to complete the task only once. Lift and AGV will return to the initial state and end the operation after completing the task goal, and will reach the position of the black rectangle in the figure. If the system continues to run, it will enter the system preparation process again. The system returns to the initial state and starts running. The black rectangle in the figure will change to a white rectangle, the subsequent system trace is consistent with the figure. It shows that the formal approach of our agent cooperation logic can indeed improve the stability of state transfer during SAS operation.

The system trace of Lift and AGV collaborating to complete a parking task.
sys-RQV indicates an improved system that uses RQV, and sys-original indicates an original system that does not use RQV. The verification results of IUPS are shown in Fig. 10. The accuracy of the system with RQV applied has been better than the original system and has been stable at about 90%, with an improvement of about 21%. This manifest the use of RQV can improve the stability of SAS operation in SAS where the formal method of agent cooperation logic is applied. SAS is able to run longer in its normal state to satisfy more task requirements.

Verification of normal time proportion of system.
By comparing the performance of IUPS in terms of the number of tasks completed, the results shown in Fig. 11 were obtained. From the figure, it can be seen that the number of task goals completed by IUPS after the adoption of RQV has improved significantly compared to the original system without RQV. The gap between sys-RQV and sys-original gradually widens as the running time increases. This means that sys-RQV is able to accomplish more of its task goals over the course of a long run. It is shown that the use of RQV can improve the operational efficiency of SAS in a formal method of applying agent cooperative logic. This is a intensely important enhancement to IUPS for long runs in unmanned environments.

Verification result of task completion times of system.
Our method can adopted in a large number of critical scenario such as software systems, embedded systems, distributed systems, real-time security verification. Most scenarios that require formal verification can apply the methods in this paper, but model checking suffers from the problem of state space explosion, which can grow exponentially for large-scale systems or complex structures. This makes it formidable to perform comprehensive model checking for large-scale systems. In the experiments of the IUPS case. We firstly verifies the security of the multi-agent cooperation logic in the system. By developing security constraints, the correctness and stability of state transfer during the system operation can be ensured. But the security verification of cooperative logic has a slice of limitations in the large-scale SAS scenario with complex functions. As the functional behavior of SAS increases, the number of heterogeneous agents also increases, and there may be more interaction events and security constraints, which are positively correlated with the cost of verification. Therefore, in large and complex SAS scenarios, whether simple and efficient security constraints and interaction logic can be formulated is an crucial factor that determines the cost and efficiency of verification.
In this paper, through multi-group comparison experiments and comparative perspective, it can be concluded that RQV has a greater improvement in single or multi-agent experiments than without using RQV. The experimental results manifest that the HMARV method proposed in this paper is feasible in heterogeneous multi-agent systems, and the system reliability is greatly improved after using this method. However, the experimental results are related to the actual data in SAS scenarios. If the actual data cannot reflect the actual SAS situation, the reliability of the experimental results may be affected to a slice extent.
Summary and prospect
This paper proposes self-adaptive runtime verification method for heterogeneous multi-agent based on probabilistic timed automata model. In order to solve the complex SAS modeling problem, agents with various functional structures are modeled as time automata model. Random variables from the internal and external environment are then abstracted as the transfer probability of the system model, causing the time automata model to undergo probabilistic random changes and become a probabilistic timed automata model. In this way, the PTAN composed of multiple PTA models can simulate the operation process of the SAS under the random environment. Aiming at the verification problem of heterogeneous agents in the system, the cooperation logic between agents is first formally described, and then the corresponding security constraint is formulated. Ensure that the state transfer of the system is maintained in a safe state, thus verifying the correctness and reliability of the system agent cooperation logic. Finally, the probabilistic model checking technique is used to quantitatively verify the functional requirements, and the comparison with the original system proves the effectiveness of the proposed method.
Our method aims to ensure the following key properties:
The limitations of the proposed method lie in the following points. The process of runtime verification is added during the operation of the system, while the verification analysis requires an additional computational process, which is proportional to the size of the model. If the number of agents and logic is large, it may lead to state space congestion generated by verification, reduce the decision-making efficiency of the system, and increase the operation cost of the system. In the process of system operation, if there are major changes in the environment, the model structure may partially fail. At this time, the system model needs to be redesigned and updated, reducing the efficiency of verification. In the real-time environment will inevitably encounter the following limitations, agent execution time and the real environment there is a difference; agent’s own equipment mechanical failure the inability to simulate the real environment in the unplanned all the complexity of the situation, and so on.
Future research will focus on the above problems and study how to optimize the state space of the model, improve the validation efficiency and apply it to larger and more complex SAS. Study the updating method of the model, consider the introduction of human factors, add human influence into the system control cycle, and maintain the adaptability of the model to the internal and external environment. Or study the dynamic updating method of the model, so that the system can monitor the model problems caused by major environmental changes, update and repair the model itself, and ensure the reliability of the verification results.
Footnotes
Acknowledgment
This work is supported by the National Natural Science Foundation of China (62072350, 62102292) and the 14th Graduate Education Innovation Fund Project of Wuhan Institute of Technology (CX2022339, CX2022322).
