Abstract
The commissioning of a new guided or automated rail transport system requires an in-depth analysis of all the methods, techniques, procedures, regulations and safety standards in order to ensure that the future system does not present any risk likely to jeopardize the safety of travelers. Among the numerous methods implemented to guarantee safety at the system, automation, hardware and software levels is “Software Errors and Effects Analysis (SEEA)”, whose objective is to determine the nature and severity of the consequences of software failures, to propose measures to detect errors and, finally, to improve the robustness of the software. In order to strengthen and rationalize the SEEA method, we chose to use machine learning techniques, in particular Case-Based Reasoning (CBR), to assist certification experts in their difficult task of assessing the completeness and consistency of the safety analysis of critical software equipment. The main objective is to exploit, by machine learning, a set of data in the form of accident or incident scenarios experienced on rail transport systems (experience feedback) in order to stimulate the imagination of certification experts and assist them in their crucial task of searching for potential accident scenarios not taken into account during the design phase of new critical software. The originality of the tool developed lies not only in its ability to model, capitalize, sustain and disseminate SEEA expertise, but also in the fact that it represents the first research on the application of CBR to SEEA. Indeed, in the field of rail transport, there are currently no software tools for assisting SEEAs based on machine learning techniques, and in particular on CBR.
Introduction
The basic principle of case-based reasoning (CBR) is to deal with a new problem by remembering similar experiences which have occurred in the past. The objective of this study is to develop a CBR system that helps safety experts judge the completeness and consistency of a Software Error Effect Analysis (SEEA). More precisely, the aim is to exploit a case base formed by historical SEEAs (source cases), carried out on already validated and certified software, in order to explain or evaluate a new SEEA on new software (target case), and thereby to help and stimulate the imagination of domain experts in the search for new critical situations contrary to safety that require the implementation of safety barriers or instructions and adequate preventive measures. This article is organized around seven major paragraphs. The first paragraph presents the main methods of railway safety analysis and in particular the SEEA method, which is the subject of this manuscript. The objective of our study, as well as the approach adopted for the development of a tool to aid the analysis and evaluation of the knowledge involved in the SEEA method, are detailed in the second paragraph. We demonstrate that the chosen approach requires the use of AI techniques, and in particular the joint use of conventional knowledge acquisition approaches and more formal machine learning methods.
The third paragraph is devoted to an analysis of the literature on AI techniques, with emphasis placed on CBR. This same paragraph presents several examples of CBR applications for rail transport safety. This bibliographic study enabled us to position, in paragraph four, our contribution with respect to the state of the art. The fifth paragraph proposes a new method for assessing the safety of critical software based on CBR. In order to demonstrate the feasibility and appropriateness of the proposed method, the sixth paragraph presents an application example based on 224 SEEA cases obtained during the knowledge acquisition phase from rail transport systems already certified and commissioned in France. The results obtained to date are presented in the last paragraph.
Scientific and regulatory context
In order to examine the advantages and disadvantages of existing approaches regarding the treatment of SEEA in general and for railway systems in particular, a detailed review of the state of the art was carried out. This bibliographic study raised the problem of confidentiality of knowledge related to rail accidents and incidents. Indeed, for reasons of industrial secrecy, published work on the application of the SEEA method to the safety of critical software is extremely rare or even non-existent. The standards nevertheless provide support for verifying and validating the dependability of software. They provide a methodological framework for controlling the completeness of software code, verifying software test coverage, analyzing the impact of software modifications and, finally, analyzing software failure modes and their effects on the system (SEEA). In France, the SEEA method was initiated and implemented in the field of safety of software used in the rail transport sector, as attested by the French standard NF F 71013 (Dependability of software) established in 1990 [1]. This standard was withdrawn in 2011 and replaced by the European standard EN 50128 [2]. There are, however, several works on the application of the FMECA method (Failure Modes, Effects and Criticality Analysis) to the analysis of software reliability and safety. It is important to note that the SEEA method is a derivative of the FMECA method, adapted to software development. Initially, the FMECA method was widely used for hardware equipment; the technique can also be applied to the analysis of the effects of software errors [3].
In France, the SEEA method was applied in the context of the development of several rail transport systems, in particular the MAGGALY system (line D of the Lyon Metro), put into service in 1992, and the TVM-430 system (a cab signaling system) of the LGV Nord-Europe (French high-speed line), put into service in 1993. As part of the feasibility study on the application of CBR to software safety assessment, we used the safety files of these two systems (MAGGALY and LGV Nord) and identified 224 SEEA cases (or insecurity scenarios).
Positioning of the study in the Feedback of Experience (REX) process
The harmful consequences of railway accidents, which sometimes lead to loss of life and destruction of the system and its environment, are the basis for the establishment of a “Feedback of Experience” (REX) system, considered an essential means of improving railway safety. It is therefore necessary to set up a REX process to record and capitalize on all accidents and incidents and thereby, at the very least, avoid the recurrence of similar accidents. Our research focuses on the implementation of methods and tools to assist safety experts, and possibly certification bodies, in their crucial task of analyzing and evaluating safety studies [4, 5, 6, 7, 8]. In the field of accidentology and safety, the analysis of accidents and incidents theoretically allows continuous improvement of safety. For a rail operator, the objective of REX is to improve the level of safety in its operation by taking advantage of negative past events (accidents, serious incidents, near-accidents, etc.) or positive events (good practice, reference, etc.). REX aims not only to reduce the number and/or severity of the dysfunctions of the system (people, installations, procedures, environment), but also to implement the most effective measures to control the risks related to the life cycle of railway systems (design, construction, operation, maintenance). The main objective of REX is to learn from lived experience in order to avoid the recurrence of adverse events. In the face of a risky situation, REX, as a process of acquiring knowledge and learning, makes it possible not only to identify knowledge, but also to share it among the actors concerned. It is an approach which aims to highlight the shortcomings, dysfunctions and incompatibilities of the safety system and to formulate proposals making it possible to avoid such situations or to reduce their consequences.
In artificial intelligence, CBR focuses mainly on problem solving based on experience. It mirrors a cognitive process of human reasoning that depends heavily on how people acquire a new skill from their past habits and experiences. CBR means using and exploiting old experiences to understand, explain, interpret or resolve new situations that resemble past ones. It emphasizes the role of past experience in solving future problems: new problems are solved by retrieving and adapting solutions to similar problems, solutions that have been stored and indexed for future reuse. CBR is a technology based on the idea of analogy: solutions to past problems (cases) can be retrieved and deployed, with adaptation if necessary, to solve new problems.
The approach proposed in this manuscript for the prevention of railway accidents is based on CBR. Starting from manufacturers’ files relating to the safety analysis of critical software, and in particular SEEA documents, the aim is to collect historical knowledge on railway safety, in particular the scenarios of potential accidents relating to train collision or derailment. These historical data come from the examination of two rail transport systems already certified and put into service in France. Faced with a new safety problem (potential accident risk), the CBR system uses the historical data to find the cases closest to this new accident risk and proposes the appropriate measures.
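To make the idea of "finding the closest cases" concrete, the following minimal sketch ranks stored cases by the share of symbolic descriptors they have in common with a new risk. It is an illustration only: the `Case` structure, the attribute names, the example values and the simple overlap measure are all assumptions made for this sketch, not the similarity measure used in the tool described in this article.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """A historical case: symbolic descriptors plus the measure that was applied."""
    descriptors: dict  # hypothetical attribute names and values
    measure: str


def similarity(a: dict, b: dict) -> float:
    """Fraction of attributes (over the union of keys) on which a and b agree."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(1 for k in keys if a.get(k) == b.get(k)) / len(keys)


def retrieve(case_base, target, k=1):
    """Return the k stored cases most similar to the target description."""
    ranked = sorted(case_base,
                    key=lambda c: similarity(c.descriptors, target),
                    reverse=True)
    return ranked[:k]


# Invented example data, for illustration only.
case_base = [
    Case({"module": "speed_control", "error": "erroneous output", "risk": "collision"},
         "cross-check the output against an independent computation"),
    Case({"module": "route_setting", "error": "loss of output", "risk": "derailment"},
         "watchdog on output refresh"),
]
target = {"module": "speed_control", "error": "loss of output", "risk": "collision"}
closest = retrieve(case_base, target)[0]
print(closest.measure)
```

In this sketch the measure attached to the closest case is only a suggestion; the expert remains responsible for judging whether it applies to the new situation.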
The main objective consists, using a set of data in the form of accident scenarios or incidents experienced on rail transport systems (feedback), to use this mass of data by automatic learning to stimulate the imagination of safety experts and support them in their difficult task of analyzing and assessing safety.
To our knowledge, this is the first study to use CBR systems to assess the safety of critical software used in rail transportation.
Positioning of the study in the safety standards applicable to critical software
Safety-critical software must be free of defects and must work according to its specifications. The goal is to create robust, high-quality software that meets customer requirements. Critical software must therefore be analyzable, testable, verifiable and maintainable. Software for rail transport must have high levels of reliability and safety. The evaluation of software consists in ensuring that the behavior of the software conforms to the needs of the user. Trust is established when the designer can demonstrate that the software is safe and does not present any situation that is contrary to safety. For complex software, this demonstration is generally laborious. Consequently, the development of safety software is generally subject to compliance with certain standards. In many critical industries, including rail, nuclear, aerospace and process industries, safety regulatory authorities require software to be qualified to specific standards in order to certify systems and therefore ensure a good level of safety.
The standard IEC/EN 61508 [9] aims to develop safety applications based on electrical and electronic systems. It is generally used for the development of “critical” software in the automation sectors as well as for industrial process control installations. Many sub-standards have been developed from IEC/EN 61508: EN 50128 and EN 50129 for the rail sector, EN 61513 for nuclear, ISO 26262 for the functional safety of road vehicles, etc. Regarding the certification of the design, realization (manufacturing, installation, testing, acceptance) and implementation of a rail transport system, it is carried out in accordance with CENELEC EN 50126 standards [10] (for railway systems), EN 50128 (for software of railway control and protection systems) [2], EN 50129 (for signaling, telecommunications and processing products and systems) [11] and EN 50657 (for on-board software of railway rolling stock) [12].
EN 50128 is a European standard which applies to all critical software used in rail transport. It describes the procedures, principles and measures to be used to ensure software safety, and provides a set of requirements that the development and maintenance of any safety software for railway applications must meet. It defines the requirements concerning the organizational structure, the relationship between organizations and the distribution of responsibilities involved in development, deployment and maintenance activities. This standard deals with the dependability of railway control and protection software for signaling, telecommunications and processing systems. Ensuring the compliance of a project with EN 50128 means implementing a rigorous methodology whose objective is to reduce risks to an acceptable level. The standard comes from the European Committee for Electrotechnical Standardization (CENELEC). The published international version of CENELEC standard EN 50128 is the international standard IEC 62279 [13]. The content of the two standards is identical.
Positioning of the study in the software validation process
In the rail transport sector, the standard EN 50128 uses the Safety Integrity Level (SIL) as a measure of reliability and/or risk reduction. The standard requires that all systems with safety implications and containing software be assigned a SIL. The integrity of the software is distributed over five levels, ranging from SIL 0 (the lowest integrity) to SIL 4 (the highest integrity). The SIL is a performance measure required for a given safety function and allows a target level of risk reduction to be specified. This allocation of safety integrity requirements depends on the respective contribution of hardware and software equipment to safety-related functions. The required SIL must be decided and assessed, in particular according to the level of risk associated with the use of the software in the system. During the control and certification phases, the experts can establish whether safety software conforms to a particular SIL. The higher the criticality, the more extensive the verification tasks to be performed. Depending on the SIL chosen, standard EN 50128 requires the implementation of a software lifecycle composed of several stages, each containing several methodological requirements: software requirements; software architecture and design; design of software components; realization and testing of software components; software integration; overall software testing and final software validation; software maintenance.
In the software development cycle, the most important and difficult phase involves testing and validating the software. Validation is defined by the standard as “an analysis process followed by an evidence-based judgment to determine whether software meets the needs of a user, particularly in terms of safety and quality”. This involves analyzing and testing the integrated “software/hardware” package to ensure compliance with the specification of the software requirements, focusing on the functional and safety aspects according to the SIL level and checking that it is suitable for its intended application. The validation manager must specify and execute additional tests or have them executed by the test manager. The added value to be provided by the validation officer consists of tests that solicit the system through complex scenarios reflecting the needs of the user. The validation manager must in particular examine the validity, consistency and adequacy of the test scenarios. He must ensure that the associated dangerous situation registers are examined and all dangerous situations are controlled.
The final objective of the SEEA method is to highlight critical points identified during the software development phases and to offer those responsible for validation tests a summary of the criticality of the modules of the software analyzed, in order to refine their approach. However, SEEAs are too detailed, too specialized and too large for this validation to be done manually. It is therefore necessary to partially automate the SEEA review task. CBR seems to us particularly well suited to helping the experts in charge of testing and validation examine SEEA documents. Indeed, all of these files represent a wealth of knowledge and know-how that it would be a shame not to exploit. In addition, the very nature of SEEA files encourages the use of CBR. CBR makes it possible to propose solutions quickly, without having to derive reasoning or an explanation from theoretical or overly general knowledge. This can represent a considerable saving of time for the expert, who does not have to repeat a complete validation process for each new SEEA file. Remembering lived experiences helps prevent the recurrence of problems that have occurred in the past and encourages the expert to take steps to avoid these errors. The cases serve, in a way, as a source of intuition and of questions for the expert.
The place of the SEEA in the methods and techniques used to develop critical software
To give confidence that software is sufficiently safe, many techniques, methods and tools from the field of software dependability can be used. Many methods contribute to the development and evaluation of software: Software Error Effect Analysis (SEEA), Boolean Decision Diagrams (BDD), detailed review, metrics, formal methods (Method B, Method Z, Vienna Definition Method (VDM)), N-versions, overlay blocks, Source Code Control System (SCCS), Revision Control System (RCS), Concurrent Versions System (CVS), dynamic tests (code exploration, non-regression tests, etc.), proof of programs, fault avoidance (in order to prevent the occurrence of errors), fault tolerance (in order to preserve the service despite the occurrence of faults), fault elimination (software tests, critical code review), etc. The European standard EN 50128 offers several techniques and measures applicable to the five levels of software safety integrity. For each stage of the software life cycle and for each SIL, Annex A of the standard provides the appropriate “measures and techniques” that can be applied. The application of these techniques and measures is graduated according to the SIL, as follows: M: Compulsory, HR: Highly Recommended, R: Recommended but not compulsory and NR: Not Recommended. The standard offers 71 methods and techniques for developing and assessing the safety of critical software used in the rail transport sector. The SEEA (Software Error Effect Analysis) method is part of this list of recommended methods. According to standard EN 50128, during the software verification and testing phase, the SEEA method is “Recommended” for SIL 1 and SIL 2, and “Highly Recommended” for SIL 3 and SIL 4.
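The graduation described above can be encoded as a simple lookup. The sketch below records only what is stated in this paragraph (SEEA is R for SIL 1 and SIL 2, HR for SIL 3 and SIL 4); the names `Rating`, `SEEA_RATING` and `seea_rating` are hypothetical, and no rating is asserted for SIL 0 because the text gives none.

```python
from enum import Enum
from typing import Optional


class Rating(Enum):
    """Graduation labels as listed in the text."""
    M = "Compulsory"
    HR = "Highly Recommended"
    R = "Recommended"
    NR = "Not Recommended"


# Only the values stated in the paragraph above; SIL 0 is deliberately absent.
SEEA_RATING = {1: Rating.R, 2: Rating.R, 3: Rating.HR, 4: Rating.HR}


def seea_rating(sil: int) -> Optional[Rating]:
    """Return the rating of SEEA for a SIL, or None when the text does not state it."""
    if sil not in range(0, 5):
        raise ValueError(f"SIL must be between 0 and 4, got {sil}")
    return SEEA_RATING.get(sil)


for level in range(5):
    rating = seea_rating(level)
    print(level, rating.value if rating else "not stated here")
```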
Software Errors and Effects Analysis (SEEA)
Safety is defined by the European standard CENELEC EN 50129 as the absence of any unacceptable level of risk. In order to guarantee an acceptable level of risk for people, the system and its environment, safety experts use several methods of safety analysis, shown in Fig. 1: Preliminary Hazard Analysis (PHA), Functional Safety Analysis (FSA), Failure Modes, Effects and Criticality Analysis (FMECA) and Software Error Effect Analysis (SEEA).
Fig. 1. The SEEA method in the safety analysis process.
The study proposed within the framework of this article concerns the SEEA method and endeavors to develop a new approach to analysis and evaluation of the safety of critical software, based on machine learning and more precisely on the Case-Based Reasoning (CBR).
It is currently impossible to demonstrate conclusively that software is free of errors. In France, in the railway sector, coded single-processor technology is used to ensure the safety of software execution. However, this technique does not provide protection against software design errors, code conformance errors, errors in non-coded safety software, or implementation errors in the coded processor. SEEA can, for its part, support the analysis of these errors, among other things. SEEA is a safety analysis approach whose purpose is to determine the nature and severity of the consequences of software failures. It also guides software validation and maintenance activities by identifying the modules most critical for safety. Indeed, SEEA makes it possible to estimate the level of validation effort to be devoted to the various elements of the software and, in particular, to guide code reviews and to better target the tests. The analysis is performed by considering software error hypotheses and examining the consequences of these errors on the other modules, as well as any resulting system-level failures. Finally, SEEA proposes measures to detect errors and improve the robustness of the software.
According to European Standard EN 50128, the objective of an SEEA is threefold: 1) identify the software components and their criticality; 2) suggest a way to detect software errors; and 3) evaluate the amount of validation required on the various software components. For each module of the software, it is necessary: 1) to suggest error hypotheses (calculation error, algorithm error, interface error, loss of output, erroneous output, loss of functionality, incorrect format, etc.), 2) to study the impact of each hypothesis on the module outputs, 3) to analyze the propagation to the modules surrounding the module analyzed and 4) to propose actions to detect and eliminate errors or to prevent their spread. Finally, an SEEA makes it possible to propose recommendations for the design and coding of the modules so that the properties stated during the software specification phase can be guaranteed.
SEEA is an inductive and systematic analysis method which takes place in three stages [1, 2, 3]: 1) identification of vital software components, 2) analysis of software errors and 3) synthesis. In the first step, the depth of the analysis (a single instruction line, a group of instructions, a component, etc.) is determined for each component on the basis of its specification. The second step, the analysis of software errors, leads to a table listing the following information: name of the component; error examined; consequences of the error at the module level; consequences at the system level; safety criterion violated; severity of the error; proposed means of error detection; criterion violated if the proposed means of detection is applied; residual severity if the detection means is applied. The last step is to identify the remaining scenarios contrary to safety and the validation effort required. This step makes it possible to group, by module, the unresolved scenarios, the criteria not respected, the means of detection and the distribution of errors according to the criticality of their manifestation. SEEA analyses begin during the software specification phase and end after the testing and integration phase. As an in-depth analysis performed by an independent team, SEEA is a powerful method of detecting anomalies.
All of the above findings show that SEEA is considered an important part of a system’s safety case. It is a fundamental document in the process of building and validating the safety of critical software. Nevertheless, the careful analysis of certain SEEA files of already certified or approved rail transport systems reveals some shortcomings. On the one hand, the representation formats of SEEA documents vary greatly from one manufacturer to another; on the other hand, drawing up and evaluating an SEEA dossier proves to be a particularly delicate and tedious exercise which is not supported by any formalized strategy. Indeed, the completeness and coherence of the analyses remain essentially based on the know-how, intelligence and intuition of the domain experts. These findings led us to use machine learning techniques, and in particular CBR.
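The propagation step of this per-module procedure can be pictured as a traversal of the software's data-flow graph. The sketch below is only one possible reading, assuming that an error propagates along directed "produces data for" edges between modules; the graph, the module names and the function `affected_modules` are hypothetical and do not come from the standard.

```python
from collections import deque


def affected_modules(data_flow, origin):
    """Return the modules reachable from `origin` along data-flow edges, in breadth-first order."""
    seen = {origin}
    order = []
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for successor in data_flow.get(current, []):
            if successor not in seen:
                seen.add(successor)
                order.append(successor)
                queue.append(successor)
    return order


# Invented module graph: each key feeds data to the modules in its list.
data_flow = {
    "odometry": ["speed_control"],
    "speed_control": ["brake_command", "display"],
    "brake_command": [],
    "display": [],
}
print(affected_modules(data_flow, "odometry"))
```

A breadth-first order lists the immediately surrounding modules first, which matches the idea of examining the neighbors of the analyzed module before more distant ones.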
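The table produced by the second step, and the grouping performed in the synthesis step, can be sketched as a record type and a small function. The field names below paraphrase the column list given above; the class `SeeaRow`, the function `synthesize`, the use of `None` to mean "no criterion violated" and the example rows are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeeaRow:
    """One line of the SEEA error-analysis table (columns as listed in the text)."""
    component: str
    error: str
    module_consequence: str
    system_consequence: str
    criterion_violated: Optional[str]
    severity: str
    detection_means: Optional[str]
    residual_criterion_violated: Optional[str]  # None: no criterion violated after detection
    residual_severity: Optional[str]


def synthesize(rows):
    """Group, by component, the rows that still violate a criterion once detection is applied."""
    unresolved = defaultdict(list)
    for row in rows:
        if row.residual_criterion_violated is not None:
            unresolved[row.component].append(row)
    return dict(unresolved)


# Invented rows, for illustration only.
rows = [
    SeeaRow("speed_control", "erroneous output", "wrong speed limit", "overspeed",
            "speed limit respected", "critical", "redundant computation", None, None),
    SeeaRow("route_setting", "loss of output", "route not updated", "conflicting route",
            "no conflicting routes", "critical", None, "no conflicting routes", "critical"),
]
for component, open_rows in synthesize(rows).items():
    print(component, [r.error for r in open_rows])
```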
Failure Modes, Effects and Criticality Analysis (FMECA)
In the literature, the SEEA method is presented, but has not been the subject of much study. Generally, this method is mentioned briefly following FMECA studies (Failure Modes, Effects and Criticality Analysis). It is important to specify that the SEEA method is derived from the FMECA method but adapted to software development. SEEA “software” is a transcription of FMECA “hardware” in a software environment. Developed by the American army in the 1940s, the FMECA method is generally regarded as a tool for dependability and quality management. Used throughout the design, development and operation process, this method has proven itself in the following industries: space, armaments, mechanics, electronics, automotive, rail transport, nuclear, aeronautics, chemistry, IT. This inductive method consists in systematically examining the potential failures of the systems (failure mode analysis) their causes and their consequences on the functioning of the whole (the effects). It is an inductive method that starts from the failures of the functions or components of the system to be analyzed in order to determine the dangerous failures that could affect it. This method highlights failures due to single failure modes that affect software or hardware.
Application of the FMECA method to software safety analysis
The FMECA method has been applied in several studies to test and assess the safety and reliability of software.
As early as 1979, Reifer [14] proposed a reflection on the possible use of the FMECA method as a means to produce more reliable software. Bowles and Hanczaryk [15] propose an application of the FMECA method to the modeling of the effects of threats from computer systems. Li et al. [16] show the interest of this method in the case of failure of the bogie of rail freight wagons. As part of the study proposed by Ozarin [17], failure mode and effects analysis was applied when designing computer software. In Nguyen’s work [18], the FMECA approach was implemented to analyze and test the reliability of software in a digital multimedia distribution system. Ozarin [19, 20] explains the role and use of the FMECA method in helping to identify and correct the failures involved in software and hardware interfaces. The work presented in Bowles and Wan [21] provides an example illustrating how software failure modes and effects analysis (FMECA) can be applied to an integrated control system based on a microprocessor.
Sozer et al. [22] proposed extensions to the FMECA method and the FTA method (fault tree analysis) in order to use them for the analysis of software reliability at the level of architecture design. Sayareh et al. [23] presented a combined approach based on the joint use of the FMECA method, the cause and effect diagram and Pareto analysis to reduce delays in cargo handling operations at marine terminals. Sulaman et al. [24] performed a comparative analysis between the FMEA approach and other safety analysis methods. The forward collision avoidance system of a vehicle was selected in this study to compare and evaluate methods of hazard analysis. Stadler et al. [25] developed software based on the FMECA method allowing the identification of risks to safety, reliability and customer satisfaction.
Related work: Examples of application of CBR
In this manuscript, our research focuses on the contribution of machine learning techniques, in particular CBR, to the safety of rail transport software. CBR is generally understood as a process for solving new problems by finding solutions to similar problems of the past. It corresponds to a behavior commonly used in solving everyday human problems: much human reasoning is based on past cases experienced personally. Very schematically, in the context of CBR, a case is a problem together with its solution and with a justification of the decisions made in generating that solution. Generally, CBR involves an iterative process that revolves around the following major steps: establishment of indexes (indexing), search for similar cases, reuse of cases, revision and learning.
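The five steps listed above can be pictured with a deliberately small sketch. It assumes that cases are dictionaries of symbolic attributes, that one attribute serves as the index, and that revision is a function supplied by the expert; the class `CaseBase`, the function `solve` and the example data are hypothetical and are not the design of any particular CBR tool.

```python
from collections import defaultdict


class CaseBase:
    """Cases are (problem, solution) pairs indexed on one chosen attribute."""

    def __init__(self, index_key):
        self.index_key = index_key
        self._index = defaultdict(list)  # indexing: index value -> stored cases

    def retain(self, problem, solution):
        """Learning: store the pair under its index value."""
        self._index[problem[self.index_key]].append((problem, solution))

    def retrieve(self, problem):
        """Search: among cases sharing the index value, pick the one with most matching attributes."""
        candidates = self._index.get(problem[self.index_key], [])
        if not candidates:
            return None

        def score(case):
            stored, _ = case
            return sum(1 for key, value in problem.items() if stored.get(key) == value)

        return max(candidates, key=score)

    def __len__(self):
        return sum(len(cases) for cases in self._index.values())


def solve(case_base, problem, revise=lambda proposed: proposed):
    """Run one cycle: search, reuse, revision, learning."""
    match = case_base.retrieve(problem)
    proposed = match[1] if match else None  # reuse the stored solution as-is
    final = revise(proposed)                # revision by the expert (identity by default)
    if final is not None:
        case_base.retain(problem, final)    # learning: keep the confirmed case
    return final


# Invented example data.
base = CaseBase(index_key="error")
base.retain({"error": "loss of output", "module": "A"}, "watchdog on output refresh")
print(solve(base, {"error": "loss of output", "module": "B"}))
print(solve(base, {"error": "incorrect format", "module": "C"},
            revise=lambda proposed: proposed or "format check at the interface"))
```

The second call shows the role of revision: when no similar case exists, the expert's own measure is recorded, so the case base grows with each confirmed solution.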
Examples of CBR applications in the rail transport sector
In the field of transport, researchers have become increasingly interested in the application of AI techniques: railway maintenance [26], traffic control [26], detection of lateral rail faults [28], detection of rail surface faults [29], railway maintenance and safety [30, 31], management of railway applications [32], improvement of call reporting systems [33], implementation of a predictive safety approach [34] and the use of Big Data by Siemens to build an Internet of trains [35].
CBR is attracting more and more attention from researchers and experts in the transport sector. Our literature search covered three transport sectors: air, road and rail. In the field of air transport we can cite, for example, the prediction of accidents and incidents [36]. In the road transport sector, applications of CBR are numerous: transport planning [37], management of traffic flows [38, 39], control of urban intersections to avoid road congestion [40], analysis of road collisions [41], improvement of traffic at urban intersections by developing new signaling plans [42], control of traffic flow at intersections (traffic control systems (TCS)) [43], diagnosis of the driver’s stress level [44] and modeling of the risk of driver fatigue [45].
Finally, in the rail transport sector, studies include the diagnosis of locomotive failures [46], the retrieval of incident reports [47], the prevention of rail operations incidents [48], the command of railway rescue (Emergency Relief Command) [49], the analysis of safety risks related to the operation of the metro [50], automatic train operation to reduce travel time and fuel consumption [51] and the diagnosis of failures of the rail switching system [52, 53].
Examples of CBR applications in software engineering
There are more and more approaches to applying CBR in software engineering. This work deals with problems of design, reuse of specifications, quality, stability, cost estimation, deadlines and, finally, prediction of software failures. The article by Adachi et al. [54] proposes a CBR-based approach to assist in the design of software. Previous design cases are recorded and, through the use of CBR, the design procedure and design results are presented to the designer as candidates for reuse. In the context of specification reuse, Maiden and Sutcliffe [55] examine the application of CBR to existing specifications, using examples of requirement reuse and successful case studies. This approach requires the use of domain knowledge to develop and validate the requirements specifications. In this study, the authors propose a paradigm for the reuse of specifications that exploits the skills and domain knowledge possessed by software engineers. In Jani’s work [56], CBR is used to assess the quality of software requirements by referring to previously stored cases of software requirement quality analysis (experience feedback). Other CBR applications relate to predicting the quality and stability of software. Ganeshan et al. [57] present a case-based reasoning technique for predicting software quality factors. The approach proposed by Grosser et al. [58] relies on case-based reasoning to predict software stability, that is to say, the ease with which a software element can evolve while preserving its design. Jiang et al. [59] propose a software cost estimation process based on case-based reasoning. Theng and Sultan [60] presented the application of the concept of software reuse based on CBR to solve the problems of delays. Finally, two articles address the prediction of software failures. Khoshgoftaar et al. [61] propose an approach for modeling the prediction of software faults.
A CBR system functions as a software model for predicting breakdowns by quantifying, for a module under development, the expected number of breakdowns on the basis of similar modules which have been developed previously. Rashid’s article [62] explores case-based reasoning and its applications for improving the quality of software by predicting error models.
Approach for assessment of SEEA.
In our study on the safety analysis of critical software, the need to improve the quality of accident risk analyses directed us towards the development of a CBR-based tool able to suggest potential accidents and/or the protection or prevention measures best suited to guarding against a particular risk. However, existing artificial intelligence approaches do not provide satisfactory answers to our research objectives. Indeed, despite the undeniable interest of these approaches, to our knowledge there are to date no applications of artificial intelligence to improving the safety of critical software used in the rail transport sector, and in particular no tools to improve the SEEA method. Specifically, the bibliographic study carried out on machine learning, and in particular on CBR, shows the absence of work on the use of CBR in the analysis and evaluation of the safety of critical software used in the rail transport sector. To our knowledge, this is the first work in this area, which is one of the original features of our study, presented in the next paragraph.
Approach adopted for the evaluation of critical software
In order to show the interest of machine learning, and more precisely CBR, in the field of railway transport safety, we have developed a tool called “Sautrel” which revolves around three phases [63, 64, 65]: data acquisition, development of the case base in the form of potential accident and/or incident scenarios, and finally design of a CBR-based tool. The phase of acquisition and abstraction of the data involved in SEEA documents led to the development of a conceptual model based on eight descriptive parameters: system, subsystem, module, envisaged error, safety criterion, feared risk, severity of damage and, finally, the error detection means. Based on the study of two already certified rail transport systems (Maggaly and TGV North), we have built up a base of learning examples which brings together, to date, 250 historical cases relating to SEEAs. To show the interest and the feasibility of the approach, a tool was implemented using the software “ReCall” from ISOFT [66, 67].
These three major phases in the development of the tool for assessing the safety level of critical software are detailed in Fig. 2 as nine steps, presented in the following paragraphs. As shown in Fig. 2, the result obtained is given alongside each step of the proposed methodology. For example, Step 1, on knowledge acquisition and modeling, led to the development of a generic SEEA representation model. Step 2, on the definition of the description language of the SEEA learning examples, led to the elaboration of the descriptive parameters (or characteristics) of the SEEAs. Step 3, on the development of the SEEA case base, made it possible to compile all the source cases, etc.
Acquisition and modeling of knowledge
This paragraph presents the results of the knowledge formalization and acquisition phase required to develop a historical case base (experience feedback) that capitalizes and perpetuates the knowledge related to SEEA. The first step of the study is devoted to identifying the descriptors and characteristic parameters needed to represent and formalize the SEEA. After a second step of data collection, needed to list the possible values taken by each parameter (or descriptor), the third step proposes a formalism for representing SEEA documents. Finally, on the basis of this formalism, which constitutes the basic SEEA representation language, the fourth step of the study focuses on building the case base, which currently comprises 224 cases. Each case represents a particular situation that is contrary to safety (Problem) and one or more preventive or corrective measures to guard against, avoid, reduce, or permanently eliminate the potential risk envisaged (Solution). To leverage SEEA knowledge (or historical cases), it is necessary to adopt a model (or formalism) that is generic enough to cover as many SEEA documents (or files) as possible from several more or less different transport systems. To build this model and to show the feasibility of the study, we examined the SEEAs of only two rail transport systems already certified and put into service in France: the automated system “MAGGALY” and the TVM (track-to-train transmission) system of the “LGV-Nord”. It is important to emphasize that each SEEA file is specific to a particular system, so sufficient analysis and abstraction work is needed to cover the majority of systems. This analysis presents some difficulties since, from one manufacturer to another, or even from one system to another, the formalism, the terminology and the depth of the analysis differ.
At the end of this review, we proposed a first SEEA representation model that relies heavily on the manufacturers’ practices and on our experience in the field of railway safety. This formalism is based on eight characteristic parameters: studied system, studied subsystem, studied module, envisaged error (family, class, type), safety criterion violated by the error, dreaded event, type and severity of the damage, and barrier and means for detecting the error. This model provides a methodological framework for preparing SEEA files and thus contributes to ensuring the quality of future analyses. An excerpt from this formalism is presented in Fig. 3. On the basis of this representation model of the SEEA forms, we have created a library of 224 typical cases.
Extract from the formalism developed for the representation of SEEA records.
This step consists in entering the description language of an SEEA based on the eight descriptors listed above (Fig. 3). A descriptor is a pair (attribute, value). All attributes are symbolic. Three types of descriptors can be distinguished: enumerated descriptors, multi-valued descriptors and unknown descriptors.
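As an illustration of this description language, the sketch below shows one possible way to hold a case as a set of (attribute, value) pairs covering the three descriptor types. It is not the ReCall data model: the class `SeeaCase`, the attribute identifiers in `ATTRIBUTES` (which paraphrase the eight parameters) and the sample values are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Union

# A descriptor value is one symbol (enumerated), a set of symbols
# (multi-valued) or None (unknown).
Value = Union[str, FrozenSet[str], None]

# Hypothetical identifiers paraphrasing the eight parameters of the model.
ATTRIBUTES = (
    "studied_system", "studied_subsystem", "studied_module",
    "envisaged_error", "safety_criterion", "dreaded_event",
    "damage_severity", "detection_means",
)


@dataclass
class SeeaCase:
    case_id: int
    descriptors: Dict[str, Value] = field(default_factory=dict)

    def __post_init__(self):
        unexpected = set(self.descriptors) - set(ATTRIBUTES)
        if unexpected:
            raise ValueError(f"unexpected attributes: {sorted(unexpected)}")
        # Attributes that were not supplied are stored as unknown (None).
        for name in ATTRIBUTES:
            self.descriptors.setdefault(name, None)

    def kind(self, attribute: str) -> str:
        """Return the descriptor type of one attribute of this case."""
        value = self.descriptors[attribute]
        if value is None:
            return "unknown"
        if isinstance(value, frozenset):
            return "multi-valued"
        return "enumerated"


# Sample target case; the error values are invented for illustration.
target = SeeaCase(1, {
    "studied_system": "MAGGALY",
    "envisaged_error": frozenset({"data", "timing"}),
})
```

In this sketch the concept descriptor of a target case is simply left out, so it is stored as unknown until a solution is proposed.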
Developing the SEEA case base
This step consists in creating cases by assigning a value to each attribute of the description language. The case base may subsequently be modified or consulted. The target case is acquired by entering the value or values of its different attributes. During this case base construction step, the concept descriptor “dreaded event” is left unknown for the target case because it represents the solution we are looking for in the case base.
Parameterization of the CBR process
During this step, the user must set different parameters to configure the CBR process. These choices concern both the descriptor that will represent the solution of the problem and the strategies of indexing, matching and adaptation. The parameters to be set are described below.
The descriptor “concept”
The user must choose, from all the descriptors, the one that will represent the solution of the problem. In our example, the descriptor “concept” is the descriptor “dreaded event”. The problem, meanwhile, is characterized by all the other descriptors.
Indexing strategies
In general, indexing rules are used both to organize the case memory and to express the relevant characteristics of the entries (the target cases) in terms of indexes. The process of extracting or choosing the source case depends strongly on the quality of the organization of the case memory. Memory organization mechanisms use several indexing techniques, such as “memory in bulk” (flat memory) or “hierarchical memory”. With “memory in bulk”, a sequential search algorithm compares the target case with every stored case in turn and returns the most similar ones. The exploration is systematic and adding a case is very easy, but retrieval is very expensive because the whole memory must be scanned. With the second mechanism, “hierarchical memory”, cases are accessed through an indexing tree or graph. Each node of the tree corresponds to a logical partition of the case base. Finding the most similar set of cases amounts to selecting, at each node, the best child. This method is efficient in search time, but adding a case is more difficult (it must be inserted at the right place in the tree). In our sample application, the tool offers several strategies for organizing the memory into a hierarchy. The user can configure this hierarchy by ordering the descriptors or by pruning the hierarchy. In our example, we construct the hierarchy by taking all the descriptors into account and by imposing the descriptors “studied system” and “studied subsystem”, in this order, as the first and second levels of the decision tree. The choice among the remaining descriptors for the next levels is then made by a decision tree classification algorithm: Quinlan’s ID3 algorithm [68].
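The hierarchical organization described above can be sketched as a nested dictionary in which each tree level partitions the cases on one descriptor. This is a simplified illustration, not the ReCall implementation: the functions `build_index` and `lookup`, the attribute identifiers and the sample values are assumptions, and the level order is fixed by hand here, whereas the tool lets ID3 choose the deeper levels.

```python
from collections import defaultdict


def build_index(cases, levels):
    """Group cases into a nested dict, one tree level per descriptor in `levels`."""
    if not levels:
        return list(cases)  # leaf: the cases of this partition
    first, rest = levels[0], levels[1:]
    groups = defaultdict(list)
    for case in cases:
        groups[case.get(first)].append(case)
    return {value: build_index(group, rest) for value, group in groups.items()}


def lookup(index, target, levels):
    """Follow the target's values down the tree and return the leaf reached."""
    node = index
    for attribute in levels:
        node = node.get(target.get(attribute))
        if node is None:
            return []  # no partition matches the target at this level
    return node


# Invented sample cases; only the system names come from the text.
cases = [
    {"id": 1, "studied_system": "MAGGALY", "studied_subsystem": "on-board"},
    {"id": 2, "studied_system": "MAGGALY", "studied_subsystem": "trackside"},
    {"id": 3, "studied_system": "TVM", "studied_subsystem": "on-board"},
]
LEVELS = ["studied_system", "studied_subsystem"]  # imposed first two levels
index = build_index(cases, LEVELS)
candidates = lookup(
    index, {"studied_system": "MAGGALY", "studied_subsystem": "on-board"}, LEVELS
)
```

The trade-off mentioned in the text is visible here: a lookup touches one branch per level instead of every case, but adding a case means placing it in the correct partition.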
Matching strategies to search for similar cases
Given a new problem to be solved (target case), the task is to find, in a known case base (source cases), the case or cases that are most similar and most relevant for solving the new problem. In this step, we generally use matching rules or similarity measures such as the “conjunctive model”, which requires each characteristic of the target case to be sufficiently close to the corresponding characteristic of the source case, or the “disjunctive model”, which evaluates the source case on the characteristic closest to that of the target case. In the latter model, a source case is considered acceptable if it is very close to the target case on at least one relevant characteristic, regardless of the value of the others. Most CBR systems evaluate the similarity of two cases by accounting for their common characteristics, typically with a distance measure such as the Euclidean distance. In our sample application, the user can intervene in several ways in the similarity calculation. The user can specify descriptors to be excluded from the computation, and can also supply a weight vector indicating the relative importance of each descriptor. In our example, we chose to extract only the 10 most similar cases and to give equal weight to all the descriptors.
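To make the difference between these matching models concrete, the following sketch computes per-descriptor similarities for symbolic attributes and combines them in three ways: conjunctive, disjunctive and weighted mean. The exact-match local similarity, the function names and the sample values are assumptions made for illustration; the measures actually used by the tool are not specified in the text.

```python
def local_similarity(a, b):
    """1.0 when two symbolic values are equal, else 0.0; unknown never matches."""
    if a is None or b is None:
        return 0.0
    return 1.0 if a == b else 0.0


def local_scores(source, target, ignored=()):
    """Per-descriptor similarities over the target's attributes, minus ignored ones."""
    return {
        attr: local_similarity(source.get(attr), value)
        for attr, value in target.items()
        if attr not in ignored
    }


def conjunctive(scores, threshold=1.0):
    """Accept only if every descriptor is close enough to the target."""
    return all(s >= threshold for s in scores.values())


def disjunctive(scores, threshold=1.0):
    """Accept if at least one descriptor is close enough to the target."""
    return any(s >= threshold for s in scores.values())


def weighted_similarity(scores, weights=None):
    """Weighted mean of the local similarities; equal weights by default."""
    weights = weights or {attr: 1.0 for attr in scores}
    total = sum(weights[attr] for attr in scores)
    return sum(weights[attr] * s for attr, s in scores.items()) / total


# Invented sample values.
source = {"studied_system": "MAGGALY", "studied_subsystem": "on-board",
          "damage_severity": "critical"}
target = {"studied_system": "MAGGALY", "studied_subsystem": "on-board",
          "damage_severity": "catastrophic"}
scores = local_scores(source, target)
```

With these values the source case is rejected by the conjunctive model, accepted by the disjunctive model, and scores two thirds under equal weights.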
Adaptation strategies for reuse of similar cases
Generally, there are two possibilities: 1) if the case found in the case base (source case) is identical to the new problem to be solved (target case), the solution is immediate; 2) if the case found is only similar (or analogous) to the new case, an adaptation procedure is necessary to adapt the retrieved solution to the needs of the new situation (target case). Thus, in the first situation, we directly apply the solution found, and in the second, we must find a suitable technique to adapt the retrieved solution and apply it to the new problem. In our sample application, the tool does not, to date, offer a real adaptation method, but allows users to program their own methods with daemons. Currently, this adaptation can be done either implicitly by the safety domain expert, by comparing the cases similar to the target case, or by the voting technique. In the latter case, the value of the attribute to be adapted is computed over all the similar cases by a vote weighted by the similarity percentage of each case. For example, if all the descriptors have equal weight and a case C has 3 descriptors, of which 2 are 100% similar to the target case and the third has no similarity (0%), then case C is similar to the target case at about 66%: (100 + 100 + 0) / 3 ≈ 66.7%.
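The voting technique can be sketched as follows: each retrieved case votes for its own value of the concept attribute, and its vote counts in proportion to its similarity with the target case. This is a minimal illustration under assumptions; the function `weighted_vote`, the attribute identifier `dreaded_event` and the sample similarities are introduced for the example and are not taken from the tool.

```python
from collections import defaultdict


def weighted_vote(similar_cases, concept="dreaded_event"):
    """Return the concept value with the largest summed similarity, plus all tallies.

    `similar_cases` is a list of (similarity, case) pairs, similarity in [0, 1].
    """
    tallies = defaultdict(float)
    for similarity, case in similar_cases:
        value = case.get(concept)
        if value is not None:  # a case with an unknown concept cannot vote
            tallies[value] += similarity
    if not tallies:
        return None, {}
    winner = max(tallies, key=tallies.get)
    return winner, dict(tallies)


# Invented retrieval result: two cases vote "collision", one "derailment".
retrieved = [
    (1.00, {"dreaded_event": "collision"}),
    (2 / 3, {"dreaded_event": "collision"}),
    (0.50, {"dreaded_event": "derailment"}),
]
proposal, tallies = weighted_vote(retrieved)
```

The tallies are returned alongside the winner so that the expert can see how strongly each candidate value is supported before accepting the proposal.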
Entering the new SEEA target case
The target case is acquired by entering the value or values of its different attributes. We leave the concept descriptor “dreaded event” unknown because it represents the solution we are looking for in the source case base.
Indexing of the SEEA case base
After developing the SEEA case representation model, i.e. the description of the problem and the solution in the form of descriptors (attribute/value), it is then necessary to build a model for organizing and indexing the memory. This model is essential to the search for similar cases and must have certain qualities. Since the search for similar cases should keep a roughly constant complexity as the case base fills up, it is wise to adopt a solution that finds similar cases quickly. To address this problem, we use an indexing method in which each node of the tree corresponds to a question on one of the indexes and the branches of the node correspond to the different answers. An index represents an element that discriminates between cases and has two fields: its name and its value. To ensure a minimum of efficiency, the tree, which is built dynamically, must ask the questions in the right order and be as shallow as possible. The best way to build it is to use the decision tree method. A decision tree consists of nodes corresponding to the attributes of the selected objects and branches corresponding to the alternative values of these attributes. The leaves of the tree represent sets of objects of the same class. The construction of decision trees is a top-down generalization approach, of which Quinlan’s ID3 algorithm [68] is a typical example. ID3 uses a heuristic, gradient-like (hill-climbing) search strategy that optimizes a numerical criterion called information gain, which is based on the entropy measure developed in the 1940s by Claude Shannon [69].
Given:
a set of exclusive classes {C1, C2, …, Ck};
a set of examples {E1, E2, …, En}, represented as (attribute/value) pairs and partitioned into the classes Ci;
ID3 produces a decision tree that recognizes (or classifies) all the examples Ei.
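The information-gain criterion used by ID3 can be illustrated with a short sketch. The entropy of a set of examples is computed from the proportions of each class, and the gain of an attribute is the reduction in entropy obtained by partitioning the examples on that attribute. The function names, attribute identifiers and the four sample examples are assumptions introduced for illustration; only the attribute-selection step is shown, not a full tree builder.

```python
from collections import Counter
from math import log2


def entropy(examples, concept):
    """Shannon entropy (in bits) of the class distribution of the examples."""
    counts = Counter(e[concept] for e in examples)
    total = len(examples)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(examples, attribute, concept):
    """Entropy reduction obtained by partitioning the examples on `attribute`."""
    total = len(examples)
    partitions = {}
    for e in examples:
        partitions.setdefault(e[attribute], []).append(e)
    remainder = sum(
        len(p) / total * entropy(p, concept) for p in partitions.values()
    )
    return entropy(examples, concept) - remainder


def best_attribute(examples, attributes, concept):
    """Attribute that ID3 would test first: the one with the largest gain."""
    return max(attributes, key=lambda a: information_gain(examples, a, concept))


# Invented examples: the error type separates the classes, the severity does not.
examples = [
    {"envisaged_error": "data", "damage_severity": "critical", "dreaded_event": "collision"},
    {"envisaged_error": "data", "damage_severity": "minor", "dreaded_event": "collision"},
    {"envisaged_error": "timing", "damage_severity": "critical", "dreaded_event": "derailment"},
    {"envisaged_error": "timing", "damage_severity": "minor", "dreaded_event": "derailment"},
]
chosen = best_attribute(
    examples, ["envisaged_error", "damage_severity"], "dreaded_event"
)
```

A full ID3 implementation would apply `best_attribute` recursively to each partition until every leaf contains examples of a single class.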
Quinlan’s method consists in successively testing each attribute to determine which one to use first in order to maximize the information gain, that is, the attribute that best distinguishes between examples of different classes. This principle has been applied in many cases and has contributed to the development of several expert systems, essentially dedicated to diagnosis. Subsequent work was devoted to improving the construction of the decision tree, in particular by reducing the size of the tree, by improving the selection strategy (which in ID3 is based on the attribute alone) through a selection based on the (attribute/value) pair, or by improving the representation of the examples through a frame-based representation. Used in a variety of fields such as data mining, business intelligence, medicine and safety, the decision tree is a decision support tool that represents a set of choices in graphical form (a tree). In our application to SEEA, we use the ID3 classification algorithm. During this indexing step, the user selects the case base to index and then starts the construction of the hierarchy. In our example (Fig. 4), the first two levels of the hierarchy are constructed from the descriptors “studied system” and “studied subsystem”. Here, the third level deals with the descriptor “Severity of the damage”.
Example of the instances base hierarchy.
Before searching for similar cases, if some information is missing (for example, an unspecified attribute value), it is possible to complete the knowledge acquisition phase by querying the domain expert. Some learning tools also attempt to detect and correct such data. In our application, during the acquisition and collection of SEEA data, particular attention was paid to this problem of noisy or inconsistent data. The search for SEEA cases similar to the target case is broken down into two stages, filtering and selection, which use static and dynamic indexes. There are different ways to determine the characteristics used as indexes: all characteristics, some characteristics, the most discriminating characteristics, etc. In our application, we adopted a similarity search based on the whole set of characteristics. To find similar SEEA cases in the case base archived in memory (source cases or reference cases), several techniques can be used, such as the “Nearest Neighbor” algorithm, whose objective is to measure the similarity between the problem (target case) and potential source cases. The comparison is based on the indexes: from the similarity on each index, the algorithm computes the global similarity sought. Recall that nearest neighbor search, or k-nearest-neighbor search, commonly used in machine learning, consists in finding, within a set of points, the k points nearest (most similar) to a given point. Generally, to optimize this method, heuristics and selection strategies are used to quickly find the cases most useful for solving the problem. Examples of such heuristics are preferring the cases that share the most important characteristics, the cases that are easiest to adapt, or the most frequently used cases. In our application example, the task is to find, in the historical case base (source cases), the SEEA cases that are most similar to the SEEA case to be evaluated (target case) and that share the most important characteristics. The screen shown in Fig. 5 presents, for our example, the result of the search for similar cases. The target case is recalled in the right column, the left column lists the 10 most similar cases, and the middle column shows one of the similar cases (here case 33).
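The retrieval of the k most similar cases can be sketched as a nearest-neighbor search over the case base. In this illustration the global similarity is simply the fraction of the target's known descriptors that the source case matches exactly; this measure, the function names `global_similarity` and `retrieve`, and the sample cases are assumptions, since the measure used by the tool is not detailed in the text.

```python
import heapq


def global_similarity(source, target, concept="dreaded_event"):
    """Fraction of the target's known descriptors (concept excluded) matched by the source."""
    attrs = [a for a, v in target.items() if a != concept and v is not None]
    if not attrs:
        return 0.0
    matches = sum(1 for a in attrs if source.get(a) == target[a])
    return matches / len(attrs)


def retrieve(case_base, target, k=10):
    """Return the k most similar source cases as (similarity, case) pairs, best first."""
    scored = ((global_similarity(c, target), c) for c in case_base)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Invented source cases and target case.
case_base = [
    {"id": 1, "studied_system": "MAGGALY", "studied_subsystem": "on-board",
     "damage_severity": "critical", "dreaded_event": "collision"},
    {"id": 2, "studied_system": "MAGGALY", "studied_subsystem": "trackside",
     "damage_severity": "minor", "dreaded_event": "derailment"},
    {"id": 3, "studied_system": "TVM", "studied_subsystem": "trackside",
     "damage_severity": "minor", "dreaded_event": "collision"},
]
target = {"studied_system": "MAGGALY", "studied_subsystem": "on-board",
          "damage_severity": "critical", "dreaded_event": None}
top = retrieve(case_base, target, k=2)
```

The default `k=10` mirrors the choice made in the example application, where only the 10 most similar cases are extracted.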
Visualization of similar cases extracted from the case base.
Example of consulting the reference cases and using the voting technique.
Suppose we have found a similar case; we can then directly reuse the solution it proposes to solve the problem (target case). In practice, a case identical to the problem is rarely found, so it is necessary to adapt pre-existing solutions. Adaptation therefore consists of building a new solution from the target case and the similar cases found. It is then necessary not only to look for the differences between the cases found (source cases) and the problem, but also to find the useful information to be transferred to the new solution. Generally, two types of adaptation are distinguished: transformational adaptation and derivational adaptation. The first approach directly reuses the solutions of past cases. Transformational adaptation does not tell us how the solutions of similar cases were generated. That is the role of derivational adaptation, which keeps, for each case stored in the case base, the reasoning process that led to its solution. Derivational adaptation then consists in applying the same reasoning to the new problem, following the paths taken by the selected past solutions and thus avoiding unsuccessful paths. In our application, the “ReCall” tool used to demonstrate the feasibility of the proposed approach does not yet offer relevant adaptation strategies. To date, the adaptation phase is still left to the user, in particular the safety expert. With the screen presented in Fig. 6, the user can consult the value taken by the concept attribute “dreaded event” in each similar case and choose the value to give to the concept attribute for the target case. The user can also use the voting technique. In our example, the tool proposes a single value for the attribute “dreaded event”: train collision. Thus, the domain expert can adapt the most similar case (proposed by the tool) by assigning the value “collision” to the concept “dreaded event” as a solution to the problem.
Since the “ReCall” tool does not propose adaptation strategies, the adaptation phase is limited in our example to indicating the class of the potential solution. The solution sought is therefore simply the value of the concept “dreaded event” proposed by the tool: “collision”. Nevertheless, this knowledge is necessary to stimulate and assist the expert in the task of safety assessment. Indeed, faced with a new problem (potential accident or incident scenario) described by a set of characteristic descriptors, it is useful to know the possible dreaded event or events (collision, derailment, electrocution, fall).
Updating the SEEA base
This last step, updating knowledge, performs the learning proper by adding the appropriate target case to the SEEA historical case base. In the “ReCall” software, this learning is not incremental, since the new case is integrated into the hierarchy without the hierarchy being reconstructed. It is up to the user to take the initiative to relaunch the indexing of the case base. During this phase of the CBR cycle, it is therefore wiser for the new case, with its new solution, to be validated by the domain expert before being added to the case base (source cases). In addition, at the end of this learning phase, it is useful to test the system on the very problem it has just treated, to ensure that it behaves as expected. Finally, it is essential to determine how to index this new case in the case base without calling into question the historical knowledge learned in previous phases, and thus to avoid new problems of inconsistency, redundancy, etc. In particular, attention must be paid to this problem of incrementality. Should we adopt a monotonic incremental learning approach (accumulation of knowledge without revising previously learned knowledge) or a non-monotonic one (re-examination of the knowledge learned with each addition of new knowledge)? This problem remains crucial in almost all machine learning systems. In our feasibility prototype, this work has not yet been completed.
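The update step discussed above can be sketched as a small case base class that only retains cases validated by the expert, refuses exact duplicates, and records that the index must be rebuilt by the user. This is a hypothetical illustration of the workflow, not the behavior of ReCall: the class `CaseBase`, its methods and the `index_stale` flag are assumptions introduced for the example.

```python
class CaseBase:
    """Minimal store of source cases with an explicit re-indexing step."""

    def __init__(self):
        self.cases = []
        self.index_stale = False  # True when cases were added since last indexing

    def retain(self, case, concept="dreaded_event", validated_by_expert=False):
        """Add a solved target case; return True if it was actually stored."""
        if not validated_by_expert:
            raise ValueError("the case must be validated by the domain expert first")
        if case.get(concept) is None:
            raise ValueError("a retained case needs a solution")
        if case in self.cases:
            return False  # avoid storing a redundant duplicate
        self.cases.append(case)
        self.index_stale = True  # the hierarchy is not rebuilt automatically
        return True

    def reindex(self):
        """Placeholder for rebuilding the hierarchy, triggered by the user."""
        self.index_stale = False


base = CaseBase()
new_case = {"studied_system": "MAGGALY", "dreaded_event": "collision"}
stored = base.retain(new_case, validated_by_expert=True)
```

Keeping re-indexing as a separate, user-triggered call mirrors the non-incremental behavior described in the text; an incremental variant would instead insert the case directly into the existing hierarchy.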
Conclusion
In order to rationalize and reinforce conventional approaches to safety analysis and assessment, we chose to use artificial intelligence and machine learning techniques, and in particular case-based reasoning (CBR). The main objective is to exploit, by machine learning, a set of data in the form of accident or incident scenarios experienced on rail transport systems (experience feedback) in order to stimulate the imagination of safety experts and assist them in their difficult task of analyzing and evaluating the safety of new critical software. These historical data concern SEEA. The implementation of this railway safety assessment approach required not only the use of machine learning but also knowledge acquisition methods to collect, structure and formalize the knowledge involved in SEEA. The knowledge acquisition phase culminated in a conceptual SEEA representation model that provides a methodological framework for safety experts. Based on this model, we acquired 224 SEEA cases (historical basis for learning). This learning base draws on experience feedback from two rail transport systems put into service in France. The first, the Maggaly system in Lyon, is fully automated, and the second relates to a high-speed line (TGV-Nord). As far as machine learning is concerned, our work falls within supervised learning. Indeed, the presence of the safety expert is essential to ensure effective and relevant learning. The domain expert is able not only to control, validate, adapt and complete the knowledge learned by the system, but also to adjust certain learning parameters. To demonstrate the feasibility of the proposed approach, we used a case-based reasoning generator named “ReCall” from ISOFT. Despite the undeniable interest of the ReCall tool, several shortcomings have been noted, in particular in its similarity calculation methods, its adaptation strategies and its handling of missing values (noisy data).
However, this contribution made it possible to demonstrate the feasibility of a new approach to modeling, capitalizing and evaluating the SEEA method, based on machine learning techniques. This approach to evaluating critical software used in rail safety is also based on the joint and complementary use of machine learning and knowledge acquisition techniques to reinforce and systematize the acquisition and transfer of knowledge in the field of railway safety. The originality of the tool developed lies not only in its ability to model, capitalize, sustain and disseminate SEEA expertise, but also in the fact that, to the best of our knowledge, it represents the first research on the application of CBR to SEEA. In fact, in the field of rail transport, there are currently no software tools for assisting SEEAs based on machine learning techniques, and in particular on CBR. Currently, the project is at the mock-up stage. Initial validation has demonstrated the interest of the suggested approaches, but improvements and extensions are required before they can be used in an industrial environment or adapted to other areas where the problem of investigating safety arises. These improvements include better adaptation strategies for the solutions proposed by the system, the enrichment of the SEEA case base to cover the whole problem and, finally, the construction of an integrated version of the prototype in order to finalize the results of the demonstration model.
