Towards a preventive maintenance approach for multi-agent applications

Abstract

Software preventive maintenance is an important software activity that consists of making modifications and updates in order to prevent serious future problems in the software. In the context of multi-agent systems, this activity has been completely overlooked. The inherent specificities of multi-agent systems (e.g., autonomy, pro-activity, reactivity, adaptability) make their maintenance a difficult task. In this paper, we propose a conditional preventive maintenance approach for multi-agent applications. The approach is based on MAS quality measurements and uses aspect-oriented programming. It consists of three major steps: (i) measuring two quality metrics (autonomy and sociability) of the running application dynamically and continuously using AspectJ code, and comparing them with minimum thresholds previously defined by the designer; (ii) warning the maintainer when an abnormal regression of the MAS quality is detected; and (iii) intervention by the maintainer to preserve the application quality and thus avoid potential damage. The approach is supported by a software tool we developed called PMMAS (Preventive Maintenance of Multi-Agent Systems). The approach and the associated tool are illustrated using a concrete case study.

Keywords

Conditional preventive maintenance, quick-fix model, multi-agent systems, aspect-oriented programming, quality measurement

1. Introduction

The multi-agent paradigm has been used for many years to develop complex systems because it conveys many rich concepts that characterize agents, such as autonomy, reactivity, robustness and pro-activity. Over the last two decades, multi-agent software development has attracted a lot of interest in both the academic and industrial sectors. Software maintenance is a crucial task in the software development life cycle. Although it accounts for more effort than any other software engineering activity, software maintenance is still a neglected phase of the software engineering process [13]. There are four maintenance categories: corrective, adaptive, perfective, and preventive. The latter is defined as maintenance performed for the purpose of preventing problems before they occur [5]. Software preventive maintenance may be systematic, forecasting, or conditional. Furthermore, several software maintenance models have been proposed, including the quick-fix model, the iterative enhancement model, and the reuse-oriented model. In the present work, we are interested in conditional preventive maintenance based on the quick-fix model, for two essential reasons. Firstly, conditional preventive maintenance is experience-dependent and relies on information gathered in real time. Secondly, the quick-fix model aims to identify the problem and then fix it as quickly as possible. Its advantage over the two other models is that it does its work quickly and at a low cost.

In the literature, few research works [2, 7, 8, 9, 13] have addressed preventive maintenance. Moreover, to the best of our knowledge, no work deals with preventive maintenance of multi-agent systems. In this paper, we propose an original approach for conditional preventive maintenance of multi-agent systems. Based on MAS quality measurements and using aspect-oriented programming, the proposed approach consists of three major steps: (i) measuring two quality metrics (autonomy and sociability) of the running application dynamically and continuously using AspectJ code, and comparing them with minimum thresholds previously defined by the designer; (ii) warning the maintainer when an abnormal regression of the MAS quality is detected; and (iii) intervention by the maintainer to preserve the application quality and thus avoid potential damage. Furthermore, the proposed approach is supported by PMMAS (Preventive Maintenance of Multi-Agent Systems), a tool we developed.

The remainder of this paper is organized as follows. Section 2 gives a brief overview of the related literature. Section 3 presents preliminaries for the proposed approach. Section 4 presents the proposed approach. Section 5 illustrates the tool we developed using a concrete case study. Section 6 discusses the approach and its limits. Section 7 gives some conclusions and future work directions.

2. Related work

In the literature, only around 4% of total maintenance effort is attributed to software preventive maintenance. In recent years, only a few approaches [2, 7, 8, 9, 13] have been proposed to deal with software preventive maintenance.

Cheluvaraju et al. [2] proposed a software quality metric called “The Preventability Metric” that measures the preventability of defects in software. The metric is derived from a composite quantitative evaluation of the efficiency and effectiveness of the individual preventive techniques applied to the software before its deployment. It indicates how well defect prevention was handled before deployment.

Vaidyanathan et al. [7] present an analytical model of a software system employing inspection-based preventive maintenance, through a Markov regenerative process with a subordinated semi-Markov reward process. While dealing with the phenomenon of software aging, the authors show that inspection-based preventive maintenance is advantageous in many cases over non-inspection-based preventive maintenance.

Sun and Wang [8] studied a preventive software maintenance policy based on an ant colony algorithm. The entire system is divided into several subsystems, and each subsystem has four kinds of maintenance policies with different maintenance costs. The model can obtain the optimal preventive maintenance policy for each subsystem, which guarantees excellent software system reliability at relatively low cost.

Garg et al. [9] present an analytical model of a software system that serves transactions. The authors consider three measures: the availability of the software to provide service, the probability of loss of a transaction, and the response time of a transaction. The model was proposed in order to counteract the phenomenon of software aging while avoiding the overhead that may be incurred. It was therefore necessary to follow an analysis-based approach in order to determine the optimal times to perform preventive maintenance.

Singh and Goel [13] analyzed the issues governing software maintenance and how preventive maintenance can help the software product age usefully. They also suggested a model for preventive maintenance integrated within the software life cycle. The proposed model is based on a classification that identifies three kinds of activities that can be carried out in software preventive maintenance, namely correction, adaptation and perfection. Hence, any maintenance request requires preventive corrective maintenance, preventive adaptive maintenance or preventive perfective maintenance.

All the approaches mentioned above are concerned with software preventive maintenance. However, none of them deals with preventive maintenance of existing agent-oriented code. In this paper, we present an original approach for conditional preventive maintenance of JADE applications. The proposed approach is based on MAS quality measurements and uses aspect-oriented programming. It is supported by an aspect-based visual tool.

3. Preliminaries

In this section, we introduce some basic concepts and tools related to the proposed approach, namely JADE [3], AspectJ [4], the quick-fix model [1] and the Cause-Effect diagram [10].

3.1 JADE platform

JADE (Java Agent DEvelopment Framework) [3] is a software framework fully implemented in the Java language. It simplifies the implementation of MAS through a middleware that complies with the FIPA specifications and through a set of graphical tools that support the debugging and deployment phases. The agent platform can be distributed across machines (which need not even share the same operating system), and the configuration can be controlled via a remote GUI. JADE is currently the most widely used platform for research purposes. It has three main modules:

• The DF (Directory Facilitator) provides yellow pages services to other agents. Agents may register their services with the DF or query the DF to find out what services are offered by other agents, including the discovery of agents and their offered services in ad hoc networks.
• The AMS (Agent Management System) controls access and use of the agent platform and provides services like maintaining a directory of agent names. It provides white page services to other agents. Each agent must be registered with an AMS.
• The ACC (Agent Communication Channel) manages the communication between the agents.

3.2 AspectJ

AspectJ [4] is a seamless aspect-oriented extension to Java. Compared with the object-oriented paradigm alone, it allows the concerns of a complex system to be modularized more cleanly [4]. AspectJ adds several new constructs to Java, including join points, pointcuts, advice, inter-type declarations, and aspects. An aspect is a modular unit of crosscutting implementation in AspectJ. Each aspect encapsulates functionality that may crosscut several classes in a program. Join points are well-defined points in the execution of the program, such as method call (a point where a method is called), method execution (a point where the body of a method runs) and method reception (a point where a method has received a call but has not yet been executed). Pointcuts are a means of referring to collections of join points and certain values at those join points. Advice is a method-like construct used to define additional behavior at join points. It consists of instructions that execute before, after, or around a join point. Around advice executes in place of the indicated join point, which allows the aspect to replace a method. An aspect can also use an inter-type declaration to add a public or private method, field, or interface implementation declaration to a class [12].

AspectJ supports two types of weaving: static weaving, which yields all the code corresponding to the join points declared in a pointcut without executing the program, and dynamic weaving, in which only the join points actually executed are intercepted at runtime. AJDT (AspectJ Development Tools) is the Eclipse plugin that provides AspectJ support.

3.3 Quick-fix model

The quick-fix model [1] (Fig. 1) is an ad hoc approach to maintaining a software system. The purpose of this model is to identify the problem and then fix it as quickly as possible. Using this model also avoids the time-consuming process of the Software Maintenance Life Cycle.

Figure 1. Quick-Fix model.

Figure 2. Example of Cause-Effect diagram [6].

3.4 Cause-Effect diagram

A Cause and Effect diagram [10] (Fig. 2) is a graphical tool used for cause and effect analysis, in which one tries to identify possible causes of a certain problem or event. The purpose of a cause and effect analysis is to identify the causes, factors, or sources of variation that lead to a specific event, result, or defect in a product or process [6].

4. The proposed approach

The approach we propose in this paper (Fig. 3) deals with conditional preventive maintenance for multi-agent applications. It continually measures some quality metrics using control code written in AspectJ and compares them with minimum thresholds previously defined by the designer. When the measured values are lower than those specified by the designer, a preventive intervention must be carried out to return the system to its desired operating state.

Note that the quality metrics relate to different attributes of multi-agent systems, such as autonomy, sociability, reactivity, pro-activity, rationality and adaptability. In this paper, we are interested in the two attributes that most influence the quality of multi-agent software, namely autonomy and sociability.

Figure 3. The methodology of the proposed approach.

4.1 Measured quality metrics

For the validation of our approach, we chose two essential metrics of multi-agent systems: autonomy and sociability.

Autonomy: in order to measure agent autonomy in JADE applications, we used the following metric proposed by Marir et al. [11]:

$\displaystyle\text{Agent's autonomy}=1-(\text{RS/EB})$ (1)

where RS is the number of requests for services and EB is the number of executed behaviors.

Sociability: the degree to which an agent is able to interact with others to meet their goals. Sociability can be measured by several metrics. In our work, we consider every communication action to be part of sociability. The formula used to measure sociability is as follows:

$\displaystyle\text{SOCIABILITY}=1-(\text{MN/EB})$ (2)

where MN is the number of sent messages and EB is the number of executed behaviors.

According to Marir et al. [11], both formulas assume that an agent makes no more than one request per executed behavior, which ensures that the values are normalized in the interval [0,1].
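Eqs. (1) and (2) can be sketched in plain Java as follows. This is a minimal illustration, not the code used in PMMAS: the class and method names are hypothetical, and returning 1.0 when no behavior has been executed yet is an assumption made here to avoid a division by zero.

```java
/** Illustrative helper for Eqs. (1) and (2); all names are hypothetical. */
final class QualityMetrics {
    private QualityMetrics() {}

    /** Eq. (1): autonomy = 1 - RS/EB. */
    static double autonomy(long requestsForServices, long executedBehaviors) {
        return oneMinusRatio(requestsForServices, executedBehaviors);
    }

    /** Eq. (2): sociability = 1 - MN/EB. */
    static double sociability(long sentMessages, long executedBehaviors) {
        return oneMinusRatio(sentMessages, executedBehaviors);
    }

    private static double oneMinusRatio(long count, long executedBehaviors) {
        if (count < 0 || executedBehaviors < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // Normalization assumption from the text: at most one request per executed behavior.
        if (count > executedBehaviors) {
            throw new IllegalArgumentException("count exceeds executed behaviors");
        }
        // Assumption of this sketch: before any behavior has run, report the maximum value.
        if (executedBehaviors == 0) {
            return 1.0;
        }
        return 1.0 - (double) count / executedBehaviors;
    }
}
```

The explicit check on `count > executedBehaviors` makes the normalization assumption visible: without it, the result could fall outside [0,1].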

5. Environment supporting our approach

As part of this work, we developed an environment (composed of several tools) supporting the conditional preventive maintenance of JADE applications. Named PMMAS (Preventive Maintenance of Multi-Agent Systems), this environment is developed using the Rich Client Platform (RCP) under Eclipse. The main advantage of this platform is its extensible architecture, which allows the integration of other plugins. In our case, we integrated the JDT (Java Development Tools) and the AJDT (AspectJ Development Tools). PMMAS offers all the functionality that allows users to write their program (using the JDT plugin) and place it under continuous control (using the AJDT plugin). PMMAS is independent of the Eclipse IDE, and it can be deployed as a standalone application or as a plugin.

5.1 Case study: Simulation of a car production system

To validate our approach, we used a concrete example of a multi-agent system: a simple simulation of a car production system. In this example, several production units produce the various components of the car, and a main unit assembles these components. Each production unit has a stock of limited size. The main unit has several stocks: one for each component and one for the final product (cars).

Figure 4. The contract net protocol.

The main unit always tries to keep the production process running properly by filling the stocks of the various components to avoid stock-outs, and by selling the final product held in stock to avoid saturation. The main unit also seeks to obtain the different components at the best offers by using negotiation mechanisms. In our case, we use the Contract-Net protocol to manage the negotiation between agents.
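The two situations the main unit tries to avoid, stock-out and saturation, can be pictured with a small bounded-stock class. This is only a sketch under assumptions: the class `BoundedStock` and its methods are hypothetical and are not taken from the case-study code.

```java
/** Hypothetical stock of limited size for one component or for finished cars. */
final class BoundedStock {
    private final int capacity;
    private int level;

    BoundedStock(int capacity, int initialLevel) {
        if (capacity <= 0 || initialLevel < 0 || initialLevel > capacity) {
            throw new IllegalArgumentException("invalid capacity or initial level");
        }
        this.capacity = capacity;
        this.level = initialLevel;
    }

    /** Adds up to quantity units and returns how many actually fit. */
    int deliver(int quantity) {
        if (quantity < 0) throw new IllegalArgumentException("negative quantity");
        int accepted = Math.min(quantity, capacity - level);
        level += accepted;
        return accepted;
    }

    /** Removes up to quantity units and returns how many were available. */
    int consume(int quantity) {
        if (quantity < 0) throw new IllegalArgumentException("negative quantity");
        int taken = Math.min(quantity, level);
        level -= taken;
        return taken;
    }

    boolean isEmpty()     { return level == 0; }        // stock-out
    boolean isSaturated() { return level == capacity; } // no room for new deliveries
    int level()           { return level; }
}
```

In this picture, a saturated final-product stock blocks further assembly, and an empty component stock blocks it from the other side.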

Figure 4 illustrates the interaction protocol between agents. It describes the Contract Net protocol using an AUML sequence diagram. When invoked, the Initiator (CarProducer) agent sends a call-for-proposal to a Participant (ComponentProd) agent. Before a given deadline, the participant agent can submit a proposal to the initiator agent (propose), refuse to submit a proposal (refuse), or indicate that it did not understand (not-understood). The proposal formulated by the participant agent can be either accepted or rejected by the initiator agent. When it receives a proposal acceptance (accept-proposal), the participant agent informs the initiator agent about the proposal’s execution. However, if it cannot fulfill its commitment, it informs the initiator through a failure message.
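The message flow just described can be summarized as a small state machine. The sketch below is a plain-Java illustration only: it does not use the JADE API, the class and enum names are hypothetical, and the deadline is not modeled.

```java
import java.util.EnumSet;
import java.util.Set;

/** Illustrative model of the message flow in Fig. 4; this is not the JADE API. */
final class ContractNetConversation {
    enum Performative {
        CFP, PROPOSE, REFUSE, NOT_UNDERSTOOD,
        ACCEPT_PROPOSAL, REJECT_PROPOSAL, INFORM, FAILURE
    }

    private Performative last; // null until the initiator sends the call for proposals

    /** Messages the protocol allows next; an empty set means the conversation is over. */
    static Set<Performative> allowedAfter(Performative previous) {
        if (previous == null) {
            return EnumSet.of(Performative.CFP);
        }
        switch (previous) {
            case CFP:             // the participant answers the call
                return EnumSet.of(Performative.PROPOSE, Performative.REFUSE,
                                  Performative.NOT_UNDERSTOOD);
            case PROPOSE:         // the initiator decides on the proposal
                return EnumSet.of(Performative.ACCEPT_PROPOSAL,
                                  Performative.REJECT_PROPOSAL);
            case ACCEPT_PROPOSAL: // the participant reports the outcome
                return EnumSet.of(Performative.INFORM, Performative.FAILURE);
            default:              // all remaining messages end the conversation
                return EnumSet.noneOf(Performative.class);
        }
    }

    /** Records the next message, rejecting anything the protocol does not allow here. */
    void record(Performative next) {
        if (!allowedAfter(last).contains(next)) {
            throw new IllegalStateException("unexpected " + next + " after " + last);
        }
        last = next;
    }

    boolean isFinished() {
        return last != null && allowedAfter(last).isEmpty();
    }
}
```

Each message sent in such a conversation would count towards MN in Eq. (2), which is why the negotiation traffic matters for the sociability metric.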

The graphical user interface of our case study, presented in Fig. 5, is divided into two sections. The right side shows the different agent roles, as well as two buttons to set up and launch the simulation. The left side contains the instances of the different agents, as well as the stock level of the different components for each production unit.

Figure 5. The main interface of PMMAS.

Once the simulation is launched, the minimum threshold setting window appears (Fig. 6). After a simple static analysis of our case study, we found that the system is in a healthy state as long as the measured metrics are above the thresholds presented in Fig. 6.

Figure 6. The minimum threshold’s setting window.

Figure 7. Agents stocks sizes and wave shipment size.

Similarly, we have to define the stock sizes of the different agents as well as the wave shipment size (Fig. 7).

5.1.1 Preventing agents’ autonomy regression

To demonstrate the effectiveness of our tool, and therefore of our approach, we deliberately used fault-injection testing, which consists of injecting the cause of an error and waiting for that error to appear and be detected. The causes of error in our case study are modeled as Cause-Effect diagrams. The Cause-Effect diagrams (Figs 8 and 15) present some causes of error in our case study relating to agent autonomy and sociability, respectively.

Figure 8. Cause-Effect diagram of decreased autonomy.

Figure 9. The autonomy measurements of the CarProd agent without error.

As mentioned above, we use the aspect paradigm to measure an agent’s autonomy. The portion of code presented in Snippet 1 calculates the number of Requests for Services (RS in Eq. (1)) made by the CarProducer agent.

Snippet 1: AspectJ code for calculating RS.

The portion of code presented in Snippet 2, on the one hand, calculates the number of behaviors executed by the CarProducer agent (EB in Eq. (1)) as well as the value of its autonomy and, on the other hand, displays a message indicating the regression of CarProducer’s autonomy when it falls below the threshold (0.82, see Fig. 6).

Snippet 2: AspectJ code for calculating EB and CarProducer’s autonomy.
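A minimal plain-Java sketch of the bookkeeping described for Snippets 1 and 2 (counting RS and EB, evaluating Eq. (1), and warning when the value falls below the threshold) could look as follows. The class `AutonomyMonitor`, its method names and the callback are hypothetical; in the approach described here this logic lives in AspectJ advice, which would call methods like these at the intercepted join points. Warning only once per crossing is a design choice of this sketch, not something stated in the paper.

```java
import java.util.function.DoubleConsumer;

/** Hypothetical per-agent monitor; names are illustrative, not taken from PMMAS. */
final class AutonomyMonitor {
    private final double threshold;            // minimum autonomy set by the designer
    private final DoubleConsumer onRegression; // called with the measured value
    private long requestsForServices;          // RS in Eq. (1)
    private long executedBehaviors;            // EB in Eq. (1)
    private boolean belowThreshold;            // true while a warning is outstanding

    AutonomyMonitor(double threshold, DoubleConsumer onRegression) {
        this.threshold = threshold;
        this.onRegression = onRegression;
    }

    /** Call where advice would intercept a request for a service. */
    synchronized void serviceRequested() {
        requestsForServices++;
        check();
    }

    /** Call where advice would intercept the execution of a behavior. */
    synchronized void behaviorExecuted() {
        executedBehaviors++;
        check();
    }

    /** Eq. (1); reports 1.0 until at least one behavior has run (an assumption). */
    synchronized double autonomy() {
        return executedBehaviors == 0
                ? 1.0
                : 1.0 - (double) requestsForServices / executedBehaviors;
    }

    // Warn once when the value crosses below the threshold, and re-arm on recovery.
    private void check() {
        double value = autonomy();
        if (value < threshold && !belowThreshold) {
            belowThreshold = true;
            onRegression.accept(value);
        } else if (value >= threshold) {
            belowThreshold = false;
        }
    }
}
```

With the threshold of 0.82 from Fig. 6, `new AutonomyMonitor(0.82, v -> System.out.println("autonomy regression: " + v))` would print a warning the first time the measured value drops below the threshold.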

Figure 10. Change in wave shipment size.

Figure 9 shows the curve of the continuously measured autonomy of the CarProd agent in normal situations and its threshold.

In Fig. 10, we changed the wave shipment size from 20 to 2. This modification is one of the causes of the decrease in autonomy shown in Fig. 11.

After the cause of the error is injected, the autonomy of the CarProd agent falls below the threshold (0.82) previously defined by the designer (Fig. 11) and a preventive alert is raised (Fig. 12).

Figure 11. The CarProd agent’s autonomy measurement after fault injection.

Figure 12. Warning of autonomy degradation.

Figure 13. Augmentation in size of wave shipment.

Figure 12 presents the alert generated when the autonomy drops just below the threshold (0.82) previously defined by the designer (regression of CarProd autonomy to 0.8166).
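As a worked reading of Eq. (1) with these values (only the ratio RS/EB can be recovered; the individual counts are not reported):

$\displaystyle 1-\frac{\text{RS}}{\text{EB}}<0.82\iff\frac{\text{RS}}{\text{EB}}>0.18,\qquad 1-\frac{\text{RS}}{\text{EB}}=0.8166\implies\frac{\text{RS}}{\text{EB}}=0.1834$

That is, the alert corresponds to the agent issuing slightly more than 18 service requests per 100 executed behaviors.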

To return the system to its normal state, we need to modify the application set-up of our case study, as was done in Fig. 10. In this case, we changed the wave shipment size to 50, as presented in Fig. 13. Figure 14 shows the curve of the measured autonomy of the CarProd agent before and after the modification (intervention).

Figure 14. The CarProd agent’s autonomy measures with correction.

5.1.2 Preventing sociability regression

Figure 15 presents some of the causes of decreased sociability between agents using a Cause-Effect diagram.

Figure 15. Cause-Effect diagram of decreased sociability.

As with agent autonomy, Snippet 3 presents the portion of AspectJ code that calculates the number of messages (MN in Eq. (2)) exchanged between the agents of the car production system.

Snippet 3: AspectJ code for calculating MN.

Snippet 4: AspectJ code for calculating EB and sociability.

The portion of AspectJ code presented in Snippet 4 calculates the number of behaviors performed by all the agents of the car production system. Furthermore, it calculates the value of sociability and displays it in the graphical interface.
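Because sociability is computed here over all agents rather than for a single one, the counters have to be aggregated across the system. A hedged plain-Java sketch of that aggregation is given below; the class `SociabilityTracker` and its methods are hypothetical, and the methods are synchronized on the assumption that they would be called from advice running in several agent threads.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical system-wide tracker for Eq. (2); not taken from PMMAS. */
final class SociabilityTracker {
    // Per-agent counters: index 0 holds MN (sent messages), index 1 holds EB.
    private final Map<String, long[]> perAgent = new HashMap<>();

    /** Call where advice would intercept a message being sent by the agent. */
    synchronized void messageSent(String agentName) {
        counters(agentName)[0]++;
    }

    /** Call where advice would intercept a behavior execution of the agent. */
    synchronized void behaviorExecuted(String agentName) {
        counters(agentName)[1]++;
    }

    /** Eq. (2) over the totals of all agents; 1.0 before any behavior has run (an assumption). */
    synchronized double sociability() {
        long sentMessages = 0;
        long executedBehaviors = 0;
        for (long[] c : perAgent.values()) {
            sentMessages += c[0];
            executedBehaviors += c[1];
        }
        return executedBehaviors == 0
                ? 1.0
                : 1.0 - (double) sentMessages / executedBehaviors;
    }

    private long[] counters(String agentName) {
        return perAgent.computeIfAbsent(agentName, k -> new long[2]);
    }
}
```

Keeping the counters per agent, rather than as two global totals, would also allow the tool to show which agent contributes most to a regression, although the paper does not say that PMMAS does this.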

Figure 16. Suspension of the buyer agent.

One of the causes of decreased sociability in our case study is the absence of purchase transactions, which leads to the saturation of stocks and consequently to the cessation of the production process. In Fig. 16, we stopped the agent that simulates the role of buyers.

Figure 17 shows the sociability curve of the agents before and after injecting the cause of error.

Figure 18 presents the alert generated when the sociability falls below the threshold (0.045) previously defined by the designer (regression of sociability to 0.025).

To return the system to its normal state, we can create a new client agent, motivate existing clients to increase their purchases, or resume our suspended client agent. In our case, we created a new client agent, as presented in Figs 19–21. The result of the correction is presented in Fig. 22.

Figure 17. Sociability measurements after fault-injection.

Figure 18. Warning of sociability degradation.

Figure 19. Creating new agent.

Figure 20. Launching the new agent.

Figure 21. The new client appears in the container.

Figure 22 represents the curve of the measured agents’ sociability before and after the modification (intervention).

Figure 22. Sociability measurements before and after the correction.

6. Discussion

As noted above, little research has been done on software preventive maintenance. This research has provided some interesting solutions to different problems in different contexts. However, none of it deals with preventive maintenance of existing agent-oriented code.

The approach proposed in this paper is a first step towards a generic preventive maintenance approach for multi-agent applications that takes into account all the quality attributes of multi-agent systems defined in the quality model we proposed in [11].

In this paper, we have only taken into account two attributes of multi-agent systems, autonomy and sociability, because they are considered the most influential attributes for the quality of multi-agent systems. The obtained results seem satisfactory and reliable. However, the main limit of our approach is its semi-automatic nature. This shortcoming is the main cause of lost time for the system under control. In its current version, the approach is not well suited to real-time multi-agent applications.

7. Conclusion and future work

Preventive maintenance of multi-agent systems is a new area of research that has not yet been explored. Only a few proposals are concerned with software preventive maintenance. In this paper, we presented an original approach for MAS conditional preventive maintenance that is based on MAS quality measurements and uses aspect-oriented programming.

Our approach is supported by a visual tool called PMMAS (Preventive Maintenance of Multi-Agent Systems), which has been validated on a JADE application implementing a car production system.

As future work, we plan to evaluate our approach and the associated tool on other case studies and to extend the tool to support other kinds of intervention. We are working on using multi-agent system reorganization as an intervention technique, and machine learning to prevent serious future problems in the application under maintenance. Using machine learning for preventive/predictive maintenance of MAS is a promising direction that could improve the efficiency of the maintenance process.

Authors’ Bios

	Nawel Ghrieb is an Associate professor at the Department of Mathematics and Computer Science of the University of Tebessa in Algeria. She is a member of DISE (Distributed-Intelligent Systems Engineering) team at ReLa(CS)2 (Research laboratory on computer science complex systems) Laboratory at the university of Oum el Bouaghi. Her areas of interest include agent-oriented software engineering and software maintenance.
	Farid Mokhati is a professor of Computer Science at the Department of Mathematics and Computer Science of the University of Oum El-Bouaghi in Algeria. He holds a University accreditation (Habilitation Universitaire) in Computer Science (Distributed Artificial Intelligence) awarded by BADJI Mokhtar University (Annaba) in Algeria. Currently, he is the head of ReLa(CS)2 (Research laboratory on computer science complex systems) Laboratory and the head of DISE (Distributed-Intelligent Systems Engineering) team. His main areas of interest include object and agent-oriented software engineering, embedded systems and formal methods.
	Mostafa Aouar Ghorab is a PhD student at the Department of Computer Science of the University Ferhat Abbas Setif1 in Algeria. He holds a master’s degree in computer science (Distributed Architectures) from the University of Oum El Bouaghi. His areas of interest include object and agent-oriented software engineering.
	Tahar Guerram is an assistant professor of Computer Science at the Department of Mathematics and Computer Science of the University of Oum El-Bouaghi in Algeria. He holds a University accreditation (Habilitation Universitaire) in Computer Science (Distributed Artificial Intelligence) awarded by Larbi ben M’hidi University (Oum El Bouaghi) in Algeria. He is a member of DISE (Distributed-Intelligent Systems Engineering) team at ReLa(CS)2 (Research laboratory on computer science complex systems). His areas of interest include multi-agent systems, machine learning and complex systems.

References

1. Penny A.G. and Takang A.A., Software maintenance – concepts and practice (2. ed.), World Scientific, ISBN 978-981-238-426-3, pp. I-XIX, 1-349 (2003).

2. Cheluvaraju, Pasala, Padmanabhuni and Chevireddy, A quantitative measure for preventive maintenance in software, ACM SIGSOFT Software Engineering Notes 37(4) (2012), 1–5.

3. Bellifemine F.L., Caire and Greenwood, Developing Multi-Agent Systems with JADE, John Wiley & Sons, 2007.

4. Kiczales, Hilsdale, Hugunin, Kersten, Palm and Griswold W.G., An Overview of AspectJ, in: Knudsen J.L. (Ed.), ECOOP 2001 – Object-Oriented Programming, Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2001, pp. 327–354.

5. ISO/IEC/IEEE 24765, Systems and software engineering – Vocabulary, 2010.

6. Wittwer J.W., Fishbone Diagram/Cause and Effect Diagram in Excel, Vertex42.com, Oct 29, 2009, https://www.vertex42.com/ExcelTemplates/fishbone-diagram.html.

7. Vaidyanathan, Dharmaraja and Trivedi K.S., Analysis of Inspection-Based Preventive Maintenance in Operational Software Systems, in: Proceedings of 21st IEEE Symposium on Reliable Distributed Systems, 2002, pp. 286–295.

8. Sun and Wang, Application of ant colony optimization in preventive software maintenance policy, in: Proceedings of IEEE International Conference on Information Science and Technology, China, Mar 2012.

9. Garg, Puliafito, Telek and Trivedi K.S., Analysis of preventive maintenance in transactions based software systems, IEEE Trans. Computers 47(1) (1998), 96–107.

10. Ron Kenett, Cause and Effect Diagrams, in: Encyclopedia of Statistics in Quality and Reliability, first published 15 March 2008.

11. Marir, Mokhati, Seridi-Bouchelaghem, Acid and Bouzid, QM4MAS: a quality model for multi-agent systems, IJCAT 54(4) (2016), 297–310.

12. Xie, Zhao, Marinov and Notkin, Detecting Redundant Unit Tests for AspectJ Programs, in: Proceedings of 17th International Symposium on Software Reliability Engineering, 2006, pp. 179–190.

13. Singh and Goel, A step towards software preventive maintenance, ACM SIGSOFT Software Engineering Notes 32(4) (2007).