An agent-based method for trust graph calculation in resource constrained environments

Abstract

In future resource constrained systems such as Industry 4.0 and Cyber-Physical Systems, thousands of resource constrained embedded devices communicate in order to reach a common objective. Information, then, is processed and transformed several times since it is acquired at physical level until it is received by user applications. Moreover, usually, each component in an current engineered system has almost no information about the other deployed modules. In that way, any malicious component may severely affect systems through the modification of operation data, as high-level applications can check the validity of received information in a hard way. Therefore, future technological components should only communicate with other trustworthy modules. Typically, solutions to address this problem are based on heavy technologies (such as certificates) which do not fulfil the requirements of embedded devices making up future systems. Thus, in this paper, a light computational solution to calculate the graph representing the trustworthy routes through which an application may obtain information is provided. Systems are represented as directed graphs, and untrustworthy branches are identified based on probability offered by a biparametric stochastic process (which is estimated by means an agent-based solution, consisting of a static and a mobile agent). Untrustworthy paths are also pruned following a hierarchical and scalable process. The proposed solution is based on the execution of some “elemental requests” whose response is known. Using the NS3 simulator, virtualization technologies and TAP bridges, the proposed algorithm is validated in a simulation scenario representing a real deployment.

Keywords

Trust security resource constrained environments graphs pruning algorithm

1. Introduction

Recently, several different paradigms to support future digital and computational systems have been proposed. Among them, one of most popular proposals nowadays is Cyber-Physical Systems (CPS). CPS are integrations of computation and physical processes [1]. Contrary to other previous works about embedded devices, CPS are focused on communications. Thus, they are primary designed to be networked (usually making up big networks), working all the connected members for a common objective [1]. Around CPS many other definitions and technological paradigms have been reported, as (for example) “Industry 4.0”. Industry 4.0 [2], as defined during the 2014 Hannover Fair, is understood as the new industrial period when CPS, the Internet of Things (IoT) [3] and Next-generation Internet Services [4] are massively employed as supporting technologies for production systems and solutions. Although, as said, different technologies may be employed to deploy Industry 4.0 systems, CPS paradigm is the most employed one [5]. In particular several architectures and reports about the future challenges in this field have appeared [6].

Figure 1.

Example of architecture for an Industry 4.0 system [66].

Any case, all these proposals present a very important and common characteristic. Both, CPS and Industry 4.0 systems are characterized by the complexity of the underlying networks, involving thousands of resource constrained devices and other components or modules (such as software applications, middleware, etc.). In a typical Industry 4.0 system (see Fig. 1) around five or six different abstraction layers are defined (following the proposed reference architecture for CPS by the National Institute of Standards and Technology [7]). These layers are isolated, and components belonging to each one of them have almost no information about the organization, modules and behavior of the other layers. Thus, important meta-information about the operation data and provided results is not accessible, especially, from the highest level, where user applications are hosted. Using this approach, different middleware may be deployed (such as those required to execute mobile agents) in an independent manner from heterogeneous hardware and embedded software components that usually made up platforms at low level.

In this context, systems are vulnerable to a new and more general type of attacks known as cyber-physical attacks [8].

These attacks are based on the strong dependencies among components in future technological systems, as well as on the great number of them composing the system, to perform malicious effects in a simple and “noiseless” way [9]. For example, as Industry 4.0 systems are usually employed to perform control policies [10], a secondary and/or small intermedium component in the system may be attacked to provide false information, so user applications operate and decide control actions based on false data. In that way, control loops may magnify the initial effect, and the entire system may fail starting from the infection of a non-relevant or critical module.

Therefore, as most recent technological systems (such as pervasive computing environments) are not centralized, each component (and especially user applications with full system access) should evaluate which components are trustworthy and can be considered to provide trustworthy information. Components only providing false data must be pruned and isolated [11], and modules with a variable behavior have to be requested to evaluate the received information before submitting it to next component.

However, traditionally, trust in technological systems is managed through computationally heavy technologies (in terms or memory and processing time, especially for resource constrained devices) such as certificates, which do not fulfill the special requirements of resource constrained embedded devices making up future engineered systems such as Industry 4.0 solutions (they present a reduced computational power, a limited communication range, etc.).

In that way, this work is motivated by the need of new, fast and lightweight security solutions to evaluate the components’ trustworthiness in Industry 4.0 systems. These new solutions must be adequate for resource constrained devices, which cannot store in a permanent manner (due to memory limitations) trust evaluation algorithms and their related data. Besides, these devices cannot maintain a complete understanding or model about the system where they are integrated, so new algorithms must be based on simplified models which allow a broad security analysis although they only partially represent the system. Finally, complex statistical methods are not admissible as a large processing time would be required; on the contrary, trust evaluations should consist of simple key performance indicators. Thus, the objective of this paper is to propose a new lightweight computational solution to fulfill this gap.

As a novelty, in our solution Industry 4.0 systems are represented as graphs, where only communication links and cyber-physical components are considered (other elements, such as configuration mechanism are not represented, for example). In that way, complex and large systems may be modeled in a lightweight manner. Then, using simple statistics about exchanged traffic among the different components, trust is evaluated in a fast manner. Besides, trust evaluation algorithms are not permanently stored in embedded devices, but executed through mobile agents, which reduces the resource consumption of this solution. In particular, the proposed solution represents resource constraint environments as directed graphs, which are analyzed in a hierarchical and scalable way using agent-based techniques. In order to do that an epidemic protocol based on a sequence of “elemental requests” is proposed. Depending on the obtained results, branches in the system may be pruned or filters can be installed. When the configuration process ends, each component has calculated a “trust circle” including the components which provide it with correct data. In order to simplify the mathematical notation and the discussion, proposal is particularized to Industry 4.0 systems, although it is applicable in an identical way to any other resource constrained environment or system (such as CPS, IoT, pervasive sensing platforms, etc.).

As main advantages, the proposed solution presents a shorter convergence time, as simple traffic statistics are calculated much faster than previous complex statistical methods. Besides, the resource consumption in resource constrained embedded devices is also low, as models based on graphs and pruning techniques are lightweight technologies. Moreover, mobile agents guarantee computational resources are only employed for a short time in an opportunistic manner.

The rest of the paper is organized as follows. Section 2 presents the state of the art on Industry 4.0 solutions, trust management, graph pruning techniques and agent-based solutions for CPS. Although Industry 4.0 are integrations of hardware and embedded software for industrial applications, in this paper we are focusing on system modeling and transformation, using the graph theory and its applications and related technologies (trust calculation, pruning techniques, etc.).

Section 3 describes the main computational contribution, including the mathematical formalization and the epidemic protocol. In this case, as said, the proposal is mainly based on system modeling through graph theory as initial step to perform calculations and transformations over the system using agent-based techniques.

Section 4 describes the performed experimental validation (based on simulations), and Section 5 presents the obtained results. Section 6 concludes the paper.

2. State of the art

This section presents an overview of the related research in the two main fields treated in this work: trust management and graph pruning algorithms.

2.1 Industry 4.0 solutions

Technically, and Industry 4.0 system is an industrial system based on CPS, but considering some special characteristics [12].

First, system heterogeneity, complexity and dynamism are pretty lower than in standard CPS [12]. Although some of the main problems to be addressed in CPS is self-configuration [13] and modeling [14], Industry 4.0 solutions usually consider static CPS, where architecture and components are stable with time. In fact, the use of CPS in Industry 4.0 refers a system where information related to the physical shop floor and the virtual computational space are highly synchronized [15], without considering other relevant characteristics (very relevant in other application scenarios). Industrial systems tend to be more homogenous and statistic than other technological solutions, a requirement that is usually considered in firstly reported Industry 4.0 real deployments [16].

Second, CPS are understood as a platform similar to other paradigms such as Internet of Things (IoT) or Smart Factories [17]. Actually, Industry 4.0 systems are characterized to integrate next-generation digital solutions such as smart networking, new digital business models or flexibly production schemes [12]. In this context, any system addressing these problems is considered an Industry 4.0 solution, although it does not fulfill traditional characteristic of CPS.

And, finally, Industry 4.0 are resource constrained environments [18]. Standard CPS may be composed of powerful components, but Industry 4.0 only considers pervasive sensing resource constrained platforms, where sensor nodes are employed to obtain real time information about critical infrastructures and processes (enabling new and advanced control techniques and solutions).

These characteristics increase the risk of Industry 4.0 of suffering a cyber-attacks in comparison to standard CPS. First because they are applied to critical systems, second because components are resource constrained and cannot implement complex protection policies [19], third because architecture is static, and attacker could take profit of bottlenecks and other similar problems; and finally because they implement advanced technological solutions which may present unknown security problems.

Nowadays, only initial proposals about Industry 4.0 have been reported; mainly architectures and reports about pending challenges [18]. However, some real deployments may be also found, including process execution systems [20] and security solutions [21].

The proposed agent-based solution is primary designed for CPS supporting Industry 4.0 systems. Thus, some additional assumptions may be done (see above) in order to develop our proposal and design the validation methodology (static architecture, low heterogeneity level, etc.). However, the proposed technology could be also applied to other resource constrained systems, including standard CPS (although performance may be different than showed in this work).

2.2 Trust management

Trust management refers all technologies focused on selecting the conditions on which a component is able to interact with other components in the system [2]. Questions about trust decisions such as when an interaction may take place, or with who a component should interact are traditionally addressed in works about trust and multi-agent systems [23]. However, most Industry 4.0 systems are not multi-agent solutions, but intelligent sensing platforms. Thus, although we are using agents as an enabling technology, we cannot consider Industry 4.0 systems as multi-agent scenarios, as it is not guaranteed all elements to be able to make decisions in a flexible and autonomous manner.

Trust management is an interdisciplinary idea, which has been investigated in different areas: from the Internet-of-Things and Peer-to-Peer systems (P2P), to social and autonomic networks.

There is little work on trust management for IoT [24]. Moreover, most of these works are related to the “social IoT”, a concept which refers the idea that common objects may interact with other daily living things and create a social network as people do [25]. In this context, contributions about protection, security and intrusion detection using trust models are divided into two groups.

In the first group, articles deal with trust evaluation. In these papers, researchers try to determine the cyber and physical parameters which should influence trust among IoT components. Typically, direct and indirect observations are employed to extract the required information through some data analytics procedures [26]. Observations may be related to networks parameters (such as the packet loss rate) [27] or may be based on the social network theory [28]. However, taxonomies about uncertainties and trust [20], enhanced processing algorithms [30] and uncertainty models [31] have been also reported. In all these proposals, trust is continuously evaluated and, employing a specific threshold, it is determined whether components communicating with a certain module in the system are trustworthy or not [32].

Sometimes, trust is evaluated through reputation models and frameworks, defining this concept as fuzzy information distributed around all the system [32]. Concepts such as local and global reputation are then proposed [11].

These previous proposals are useful but require a high computation capacity in Industry 4.0 nodes to be executed. Complex statistical frameworks or large databases have to be maintained. Moreover, these works usually try to “infer” the trust a component deserves, so protection actions are not necessary supported by clear evidences.

In the second group of works, papers are focused on recommendation algorithms. In this case, trust is calculated from opinions publicly shared by components in the system [33]. Phenomena, such as the trust reduction when receiving opinions from distant components, are evaluated [34]. In these papers, trust (and distrust) propagation in simple graphs is formalized, defining ideas like the iterative propagation of trust [35].

Social and automatic networks present an intrinsic social behavior [36]. In these scenarios, the concept of trust graphs (graphs representing the trust relationships in a system) is proposed, and (considering the particular proposed definition for trust) some algebraic properties of trust are obtained and validated [36]. As in social networks, usually, nodes publicly share their opinion; trust is typically based on voting in these papers. As in the first group, these contributions try to detect untrustworthy nodes in the network (although it is not specified how the network is protected or reacts once the malicious node is located).

Different contributions in the context of sematic technologies have appeared in relation with trust. Specific data formats for opinion and trust calculation algorithms (similar to weigh calculation techniques in neural networks [37]) have appeared [38]. Moreover, different formats to represent reputation have been also reported in other fields [39].

Finally, in the field of P2P systems (and ad hoc networks), different types of trust are defined [40], which are employed to propose new ideas such as “trust witnesses”. In these proposals, opinion is usually composed of various parameters (such as trust and confidence) which made up an “opinion space” [41] where algebraic structures [42] (e.g. semirings) are defined. In this field, besides, the idea of “trust graph” as protection or security policy is connected with other techniques as certificate distribution [43] (i.e. trust implies delegation) or reputation-based sharing platforms [44].

All the cited proposals, nevertheless, only aim to evaluate the trustworthiness of nodes (components in the Industry 4.0 system), keeping the same system architecture regardless of the obtained results. In that way, trust graphs are understood as graphs which include information about trust (but which preserves the original topology of the system and presents the same behavior). On the contrary, in our proposal, trust information is employed to dynamically modify the system topology, so during operation no more calculations are required (and resources may be used in a more efficient way). In order to reduce the global system graph to the trust graph, malicious components are pruned using the most appropriate technique. Pruned elements may be included another time in the system if their untrustworthy behavior changes.

2.3 Pruning techniques

Pruning techniques refers those algorithms and solutions focused on detecting and removing branches or nodes (components) in graphs, neural networks, workflows, etc. fulfilling certain conditions [45].

Different graph pruning techniques have been reported in the research literature.

A first type of proposals consists of removing a node without losing connectivity in the graph, by creating direct connections between the points previously connected through the removed node [46]. This approach is usually named as “creation of a dense graph”.

Other proposals consider different characteristics in the nodes (such as if they are dominant) to prune graphs following certain rules (for example, both created sub-graphs must include a dominant node) [47].

A particular case of pruning techniques is which are applied to decision graphs. In this case, when a node is removed, all its successors are also deleted. This technique is usually employed to simplify the decision-making process [48], as the resulting graph may be severely reduced compared to the original [49].

Other type of pruning techniques is focused on removing loops in graphs [50]. This technique, contrary to the previously cited ones, does not remove any node but connections among them.

With the same idea, other proposals have been reported, for example, calculating pruning actions by means of a forward-backward algorithm [51].

Pruning techniques based on graph simplifi- cation [40] are unhelpful in relation to Industry 4.0 systems, as with this approach large parts of the system may be get isolated and useless. Thus, with the proposed technique, vertices are not removed and only edges associated with the component under study may be affected. Solutions focused on removing loops are inapplicable, as it is not the focus of the article (in fact, in this work graphs with cycles are considered and admitted). Finally, in Industry 4.0 systems, components cannot be removed by a peer (another module), as this task is reserved to the system administrator. Then, the proposed technology should be focused on pruning graphs by removing edges instead of nodes. The proposed algorithm employed to select the edges to be pruned is explained in the next section.

2.4 Mobile agents and CPS

Agent-based solutions (especially when employing mobile agents) are a very well-known technology. For more than thirty years, different mechanisms and techniques have been reported [52]. This interest for mobile agents is still alive nowadays, and agent-based solutions for new paradigms such as Cyber-Physical Systems and Internet of Things (IoT) are constantly reported [32, 53].

Agent-based solutions for CPS cover a large catalogue of applications: from smart object implementation [54] to human-computer interaction models [55]. However, most common agent-based proposals for CPS are related to security issues [32, 56]. Actually, some important authors have evaluated how mobile agents break some traditional assumptions in computer sciences, opening new security issues which prevents the use of this technology in CPS and IoT [57].

On the other hand, CPS are very heterogeneous systems, a characteristic that may also prevent the integration of agents into CPS [58]. However, techniques to overcome the CPS heterogeneity and apply agents and other similar software technologies to CPS have been reported in the last years [59, 60]. In order to guarantee the feasibility of the proposed solution, the described technology in this paper should be combined with any of the existing mechanism to implement mobile agents on heterogeneous CPS.

Focused on the topic of this paper, different security solutions for CPS employing mobile agents may be found. Recent proposals addressing this topic include cyber-protection assurance through secure agent implementations (including, for example, a Public Key Infrastructure) [61]. However, most of these solutions employ agents to collect information about the system and make a decision about its operation in a previously defined bifurcation point [62]. Other solutions consider security point injection in workflows, which are managed by mobile agents, or intrusion detection mechanisms [5]. These proposals support the usability of mobile agents in the context of CPS for security applications; but proposed solutions are, nowadays, still very heavy and computationally costly. Moreover, they are focused on very particular scenarios, so more general and lightweight solutions are needed.

In the proposed solution, to create a lightweight mechanism, mobile agents do not collect raw data to be later processed but obtain simple traffic statistics which may be represented using small data structures. Thus, mobile agents are smaller and consume fewer communication and computational resources. Besides, generic parameters are studied, so the proposed technique is adequate for any CPS application scenario.

3. Trust graph calculation

In this Section a mathematical formalization of the proposed contribution is provided. Resource constrained environments (particularized into Industry 4.0 systems) are represented as directed graphs, which are pruned to obtain the trust graph of each component. Data structures and protocols employed to support the proposed algorithm are also described.

Previous works [11] described a simplified calculation process where only systems with one-to-one correspondences between hardware devices and user applications were allowed. In this paper we extend these previous proposals considering arbitrary systems.

Among all questions to be answered as part of the “trust” problem in technological systems, in this work we are dealing with the challenge: which components in the system should a certain element interact with?

3.1 Fundamentals of the mathematical modeling

An Industry 4.0 system ${\mathfrak{T}}$ is an ordered pair of finite sets ${\mathfrak{T}}=\left({C_{\mathfrak{T}},L_{\mathfrak{T}}}\right)$ , being $C_{\mathfrak{T}}$ and $L_{\mathfrak{T}}$ non-empty sets. $C_{\mathfrak{T}}$ is named as the collection of system components, whose cardinality is $N$

$\displaystyle C_{\mathfrak{T}}=\left\{{c_{i}\,\,i=1,\ldots,N}\right\}$

and $L_{\mathfrak{T}}$ is the set of communication links, being $\textit{card}\linebreak\left({L_{\mathfrak{T}}}\right)=K$

$\displaystyle L_{\mathfrak{T}}=\left\{{l_{i}\,\,i=1,\ldots,K}\right\}$

Both sets $C_{\mathfrak{T}}$ and $L_{\mathfrak{T}}$ are related by means of three different applications

$\displaystyle\gamma_{\mathfrak{T}}:L_{\mathfrak{T}}\to C_{\mathfrak{T}}\times C% _{\mathfrak{T}}$ $\displaystyle\sigma_{\mathfrak{T}}:L_{\mathfrak{T}}\to C_{\mathfrak{T}}$ $\displaystyle\tau_{\mathfrak{T}}:L_{\mathfrak{T}}\to C_{\mathfrak{T}}$

$\gamma_{\mathfrak{T}}$ is the incidence application of the system, and relates each communication link $l_{i}$ with the order pair of components $\left\{{c_{i},c_{j}}\right\}$ this link connects. The link is outgoing from $c_{i}$ and incoming in $c_{j}$ .

$\sigma_{\mathfrak{T}}$ is the source application of the system, and associates each link $l_{i}$ with the component $c_{i}$ which is the origin of the communication link $l_{i}$ .

Finally, $\tau_{\mathfrak{T}}$ is the target application of the system, and associates each link $l_{i}$ with the component $c_{j}$ which is the destination of the communication link $l_{i}$ .

In that way, an Industry 4.0 system may be understood as a directed or oriented graph $G_{\mathfrak{T}}$ characterized by any of the two following tuples

$\displaystyle G_{\mathfrak{T}}=\left({V_{G},E_{G},\gamma_{G}}\right)$ $\displaystyle G_{\mathfrak{T}}=\left({V_{G},E_{G},\sigma_{G},\tau_{G}}\right)$

$V_{G}$ is the set of vertices of the graph, and $E_{G}$ is the set of edges. $\gamma_{G}$ is the incidence application of the oriented graph, and $\sigma_{G}$ and $\tau_{G}$ are the source and target functions, as defined in the mathematical graph theory.

The graph $G_{\mathfrak{T}}$ is said to represent the Industry 4.0 system ${\mathfrak{T}}$ if there exists a pair of bijections $\left({\mathfrak{F}_{V},\mathfrak{F}_{E}}\right)\mathfrak{F}_{V}:C_{\mathfrak{% T}}\to V_{G},;\mathfrak{F}_{E}:L_{\mathfrak{T}}\to E_{G}$ fulfilling that $\forall c_{i},c_{j}\in C_{\mathfrak{T}}$ and $\forall l_{i}\in L_{\mathfrak{T}}$ such $\gamma_{\mathfrak{T}}\left({l_{i}}\right)=\left\{{c_{i},c_{j}}\right\}$ and $\sigma_{\mathfrak{T}}\left({l_{i}}\right)=c_{i}$ and $\tau_{\mathfrak{T}}\left({l_{i}}\right)=c_{j}$ . It is proved that $\exists v_{i},v_{j}\in V_{G}$ and $e_{i}\in E_{G}$ such $\gamma_{G}\left({e_{i}}\right)=\left\{{v_{i},v_{j}}\right\}$ and $\sigma_{G}\left({e_{i}}\right)=v_{i}$ and $\tau_{G}\left({e_{i}}\right)=v_{j}$ if, and only if, $v_{i}=\mathfrak{F}_{V}\left({c_{i}}\right)$ and $v_{j}=\mathfrak{F}_{V}\left({c_{j}}\right)$ and $e_{i}=\mathfrak{F}_{E}\left({l_{i}}\right)$ .

In order to clarify this point, Fig. 2 shows a graphic representation of both functions. As can be seen, these functions allow transforming a technological system into a graph and vice versa, so (hereinafter) we are using indistinctly both notions.

Figure 2.

Correspondence between mathematical graph and Industry 4.0 system.

Moreover, vertices and edges in the $G_{\mathfrak{T}}$ graph may inherit some properties from components and communication links in the Industry 4.0 system. In particular, each vertex has associated a collection of three relevant parameters $\left({ID,\textit{tier},Op}\right)$ :

•

$I D$ is a unique identifier (a binary word). It can be managed by a global institution such as the IEEE or by a local administrator. Ensuring the adequate identification of vertices is a key factor in the proposed solution.

•

tier is an integer number indicating the abstraction layer to which the component (vertex) belongs. Usually, such as in the OSI reference model, layers closer to the physical layer (hardware devices) present lower numbers than layers closer to final applications (users).

•

$O p$ is a list describing (naming) all the operations the component may perform. It can be composed of alphanumerical names, logical operation identifiers, etc. Hereinafter, $M=\textit{card}\left\{{Op}\right\}$ , being $\textit{card}\left\{\cdot\right\}$ an operator to indicate the calculation of the cardinality’s set; and $M$ the number of considered operations in our model.

On the other hand, edges are also characterized by two different parameters $\left({ID,\textit{trust}}\right)$ :

•

$I D$ mantains the same meaning as explained for vertices.

•

trust represents a number indicating the trustworthiness of the messages being transmitted through the link. In general, this value is calculated by the source vertex. If desired, this value may be understood as a weight of the communication link. In that case $G_{\mathfrak{T}}$ turns into a labeled graph or, being more specific, (as trust should take values from an ordered set, typically the real numbers) $G_{\mathfrak{T}}$ turns into a weighted graph.

Figure 3 shows a graph representing a simple Industry 4.0 system.

In a generic case, thousands of components and communication links make up a resource constrained environment – or Industry 4.0 system – (and, thus, the same number of vertices and edges composes the equivalent graph). It is not strange if some of these components or links (due to different reasons such as interferences, cyber-attacks, etc.) behave in an untrustworthy way. The objective of every module in the system, then, should be to avoid as much as possible relating or employing these components or links.

Figure 3.

Example of a graph representing an Industry 4.0 system.

In this context it is defined the absolute trust graph $T_{c_{i}}$ of a component $c_{i}$ as the subgraph of $G_{\mathfrak{T}}$ composed by the vertices and edges with a trustworthy behavior from the point of view of $c_{i}$ . Figure 4 shows a trust graph in color green.

The final objective for the component $c_{i}$ is to route its messages and packets and, in general, perform its actions using only elements belonging to this trust graph. The main obstacle in this challenge is the fuzzy management of these systems, where each component only has information about the modules that surround it (i.e. with which it shares a communication link). Therefore, it is especially important to define the trust circle of a component.

The trust circle ${\mathfrak{C}}_{c_{i}}$ of a component $c_{i}$ is a set containing all the components adjacent to it with a trustworthy behavior from the point of view of $c_{i}$ . Two components are considered adjacent if their corresponding vertices in $G_{\mathfrak{T}}$ are also adjacent in mathematical sense. Figure 4 shows a trust circle in color purple.

Figure 4.

Example of a graph with the proposed definitions for trust management.

With these ideas, each component should be able to calculate its own trust circle (as it has information about the adjacent nodes), and a mechanism by which this calculation also causes the obtaining of the trust graph must be defined (using, probable, a distributed procedure).

In all cases, however, the proposed “absolute definitions” are very restrictive; as components and communication links generally show different behaviors when different operations are performed. Thus, in order to maximize the use of the available resources in the system, components may employ a relative definition on the proposed concepts.

In that way, the relative trust graph $T_{c_{i}}^{op_{j}}$ of a component $c_{i}$ relative to the operation $op_{j}$ is defined as the subgraph of $G_{\mathfrak{T}}$ composed by the vertices and edges with a trustworthy behavior from the point of view of $c_{i}$ when performing the operation $op_{j}$ . The absolute trust graph is related to the relative trust graphs, as it is made of the elements present in all relative trust graphs. In particular:

$\displaystyle\forall v_{i}\in V_{G},v_{i}\in T_{c_{i}}\Leftrightarrow v_{i}\in V% _{T_{c_{i}}^{op_{j}}}j=1,\ldots,M$ $\displaystyle\forall e_{i}\in E_{G},e_{i}\in T_{c_{i}}\Leftrightarrow e_{i}\in E% _{T_{c_{i}}^{op_{j}}}j=1,\ldots,M$

The relative trust circle is defined in a similar way.

From the computational point of view, the calculation algorithms, the obtained absolute trust graph and the relative trust graphs depend all in the same way on the number of elements (components and links) in the system. Thus, they present the same complexity order in respect to the system topology and size (and, then, the same scalability properties, bottlenecks, etc.). Later, these dependencies will be discussed. However, if relative definitions are employed, the calculation process depends also on the number of available operations. In fact, computationally, there is no difference between calculating the absolute trust graph or any of the possible relative trust graphs, only the employed protocols or data format may change. Thus, the manner of obtaining all the possible relative trust graphs is to repeat the calculation for every available operation. In that way, the corresponding algorithm depends linearly on the number of available operations. This situation should be considered when estimating the refreshing period or the required computational time to perform the calculations.

Hereinafter we are describing a trust graph calculation process, which may be applied to obtain the global graph or any relative graph.

3.2 Probability and trust calculation models

Being $A$ the adjacent matrix of $G_{\mathfrak{T}}$ , the number of paths $p_{i,j}$ from vertex $v_{i}$ to vertex $v_{j}$ may be calculated as

$\displaystyle p_{i,j}=\sum_{r=1}^{K}\left[{A^{r}}\right]_{i,j}$

In its basic form, the adjacent matrix $A$ indicates if two vertexes are connected by an edge or not. If vertexes $v_{i}$ and $v_{j}$ are connected by an edge, the position $\left({i,j}\right)$ in $A$ is set to the unit; zero if they are not connected. Besides, in general, the $r-th$ pow of $A$ indicates if vertexes are connected through $r$ edges. Thus, the position $\left({i,j}\right)$ in $A^{r}$ is set to the unit if vertexes $v_{i}$ and $v_{j}$ are connected by $r$ edges; zero if they are not connected.

Then, to calculate the number of paths between two vertexes $p_{i,j}$ in a directed graph with $K$ edges, it must be evaluated if these vertexes are connected through all possible numbers of edges $r$ ; incrementing a counter each time vertexes are connected.

Equally, the number of paths that connect vertex $v_{j}$ to vertex $v_{i}$ may be obtained as

$\displaystyle p_{j,i}=\sum_{r=1}^{K}\left[{A^{r}}\right]_{j,i}$

Thus, the number of paths $p$ that allow connecting vertices $v_{i}$ and $v_{j}$ in a bidirectional way (i.e. cycles centered in $v_{i}$ and passing through $v_{j}$ ) is

$\displaystyle p_{j}^{i}=p_{i,j}\cdotp_{j,i}$

However, Industry 4.0 systems are characterized by presenting a lot of redundancies. Thus, a direct communication between particular low-level components and user applications is rather rare. Usually, connections are stablished with an entire platform containing several devices with similar functions. These platforms belong to a unique abstraction layer, so all devices (vertices) with similar functions present the same value for tier parameter. Considering that $\textit{tier}_{v_{j}}=n$ and a total of $T$ vertices present also the value $\textit{tier}=n$ , the number of paths $p^{i}$ whose trustworthiness have to be evaluated before obtaining the trust graph of vertex $v_{i}$ is

$\displaystyle p^{i}=\sum_{r=1}^{T}p_{r}^{i}$

Formally, the probability of a path $p t$ to be trustworthy may be calculated as the combination of the different edges (links) and vertices (components) making up the path to be trustworthy. As, in general, the trustworthiness of every edge and vertex is an independent event, the final probability can be obtained through a product ( $E$ is the number of edges composing the path)

$\displaystyle P\left[{\textit{pt trustworthy}}\right]$ $\displaystyle{}=\prod_{j=1}^{E}P\left[{e_{j}\textit{ trustworthy}}\right]\prod% _{j=1}^{E-1}P\left[{v_{j}\textit{ trustworthy}}\right]$

This approach, nevertheless, implies vertex $v_{i}$ to know every edge and component in the path, so it can obtain the global probability. In practical Industry 4.0 systems this assumption is in general not true, so vertex $v_{i}$ can only evaluate the trustworthiness of edges and vertices adjacent to it.

Let ${\mathfrak{N}}_{v_{i}}$ be the set of all vertices adjacent to $v_{i}$ in $G_{\mathfrak{T}}$ . Moreover, let ${\mathfrak{N}}_{v_{i}}^{s}$ be the set of all vertices adjacent to $v_{i}$ , so the edge joining these vertices starts in $v_{i}$ . Finally, ${\mathfrak{N}}_{v_{i}}^{t}$ is the set of all vertices adjacent to $v_{i}$ , so the edge joining these vertices finishes in $v_{i}$ . It can be easily proved that ${\mathfrak{N}}_{v_{i}}={\mathfrak{N}}_{v_{i}}^{s}\mathop{\cup}\nolimits{% \mathfrak{N}}_{v_{i}}^{t}$ .

Thus, in this work, it is said a vertex $v_{j}\in{\mathfrak{N}}_{v_{i}}^{s}$ is trustworthy along $v_{i}$ if, and only if, paths starting in $v_{i}$ and passing through $v_{j}$ are trustworthy. Equally, it is said a vertex $v_{j}\in{\mathfrak{N}}_{v_{i}}^{t}$ is trustworthy along $v_{i}$ if, and only if, paths finishing in $v_{i}$ and passing through $v_{j}$ are trustworthy.

In order to evaluate the trustworthiness of a path, whose elements (edges and vertices) are unknown, components are provided with the infrastructure that is shown in Fig. 5.

Each node maintains a Trust repository (TR) where trust information about the adjacent vertex and links is stored. In particular, the trust circle ${\mathfrak{C}}_{c_{i}}$ of the vertex $v_{i}$ and the probabilities for the vertex’ neighbors ${\mathfrak{N}}_{v_{i}}$ to be trustworthy are contained in the TR.

Information in the TR is generated and updated by a Trust Agency (TA). The TA represents a multi-agent system, including two different types of agents. Among its functions, TA must control the generation, behavior, temporization, design, composition, etc. of agents (but is not an agent). Basically, it orchestrates the agents’ operation. This component is unique, contrary to agents that are diverse, and isolates the agent coordination from details about trust graph calculation encapsulated into agents (specifically, into MaA).

Figure 5.

Trust calculation platform.

The management agent (MaA) is a static agent which controls the trust calculation process. It gets activated each time trust graphs have to be calculated or refreshed. Basically, it constructs the ephemeral data structures necessary to perform the calculation, activates the Mobile Agents (MoA) to visit the target vertices and computes the trust graph based on the behavior of the different paths when performing some simple operations named as “elemental requests”. These elemental requests are regular operation requests which have to be answered following a pre-programmed pattern. If received responses do not fit these patterns, the vertices through which messages are sent and/or received are suspected to be untrustworthy (see Section 3.3).

The use of mobile agents presents, in this context, three important advantages.

First, the use of mobile agents enables a fast a lightweight trust evaluation, as resource constrained nodes only must store and execute the proposed algorithm (which is also simple and computationally low cost) in an ephemeral and opportunistic manner. In that way, we are meeting the most important requirement and characteristic of Industry 4.0 systems (see Section 2).

Second, univocal vertex identification is enabled. In message passing solutions, nodes are usually identified by global communication identifiers which are location-dependent and ephemeral (such as IP addresses). Thus, any component moving around the system may appear with different identifiers at different moments which makes difficult to evaluate its trustworthiness in an efficient manner. Contrary to this approach, mobile agents may access to unique identifiers stored in components in a very easy manner. These identifiers do not change with the component location, and trustworthiness is then evaluated in a much more efficient way, even when components get in and get out the system dynamically.

Third, and finally, system scalability is enhanced. As trust evaluation is orchestrated by the TA, traffic engineering policies and other similar solutions to avoid network congestion synchronizing the operation of all components in the systems may be applied. On the contrary, traditional asynchronous message-passing solutions cause a traffic flood generating the system congestion when the number of interconnected components goes up.

Two additional considerations have to be taken into account.

First, in order to respond to the elemental requests (ERQ) with the previewed patterns, programming into components in the Industry 4.0 system should be modified. In order to avoid this problem, the TA also includes some MoA which visit the components of the system and respond to the ERQ with the pre-programmed patterns. These agents are triggered by the MaA, and also collect information about the transmission and execution process in order to improve the trust graph calculation. Agents are routed by the system following the standard routing rules, and get hosted in the vertices fulfilling a certain requirement (e.g. belonging a certain abstraction layer). If necessary, MoA can be replicated during routing to reach multiple destinations.

And, secondly, it has to be considered that absolute results are almost impossible to obtain. In that way, a trustworthy component will never return 100% of correct responses, random errors also occur and they do not have to affect the trustworthiness of nodes. In that way, two thresholds are defined $\theta_{\min}$ and $\theta_{\max}$ , being $0\leqslant\theta_{i}\leqslant 1$ . Thus, an element is considered trustworthy if, and only if, $P\left[\textit{trustworthy}\right]\geqslant\theta_{\max}$ . Besides, an element is considered untrustworthy if, and only if, $P\left[\textit{trustworthy}\right]\leqslant\theta_{\min}$ . For any other value between thresholds, different traffic engineering techniques (where trust probabilities may be considered as weights) or traffic filters may be considered.

Although traditional expert-created weights could also be employed, some important differences with probabilities must be considered. First, probabilities represent a real, dynamic and measurement-supported estimation; while expert-created weights tend to be fixed and based on previous experiences whose applicability may be limited. Thus, efficiency of traffic engineering techniques is improved when probabilities are employed as weight. And, second, probabilities have a direct physical meaning, while weights may present different consequences. That makes easier the configuration of probability-based solutions than expert-created weights mechanism. Furthermore, as probabilities may be calculated in a systematic and automatic manner, human intervention may be totally removed (which is not possible when using expert-created weights).

On the other hand, a final risk must be discussed. While traveling through the graph, agents could be infected, duplicated or (in general) turned into malicious objects. In that way, cybercriminals could introduce confusion in the system. Then, a solution to protect agents would be required. However, different characteristics make this risk almost negligible, and no additional mechanism to protect agents is actually needed. Namely:

Contrary to cyber-physical components, that are persistent elements in the system architecture, proposed agents are ephemeral objects with a very short lifetime. Thus, infected or duplicated agents cannot operate indefinitely, and their temporal impact is very limited.

Besides, agents may be provided with a (light- weight) digital signature to guarantee the integrity and authenticity of ERQ and patterns. Moreover, as agents are ephemeral objects, this signature does not have to be very complex, and proposed solution may be still supported by resource constrained devices.

3.3 Trust calculation process

Considering all the previous descriptions, the trust graph calculation process is as follows.

First, the MaA of $v_{i}$ vertex gets activated and triggers the MoA to reach the vertices characterized by a value $\textit{tier}=n$ (dst field). The selected value of tier parameter determines the “depth” of the calculated trust graph (which may include only one abstraction layer or – even – all of them).

$\displaystyle\textit{MoA}=\left\{{\textit{dst}:\left\{{\textit{tier}:n}\right% \},sq_{\textit{number}},\textit{code}}\right\}$

The mobile agent is transmitted by means of an epidemic protocol through the system: each component receiving the agent transmits it to all its neighbors. Moreover, each mobile agent is marked with a unique sequence number (and, of course, the code to be executed). In order to avoid duplicated transmissions, a vertex only transmits agents with new sequence numbers (in other cases transmission is discarded).

Once the MoA arrives to a vertex fulfilling the requirements described in the dst filed, it gets installed.

At the same time, MaA triggers a timer. When time is over it generates a collection of elemental requests

$\displaystyle\textit{ERQ}=\left\{{\textit{vertex}_{\textit{src}},rq_{\textit{% msg}},sq_{\textit{number}},\textit{timestamp}}\right\}$

Each elemental request describes the source vertex (the vertex through which the root vertex $v_{i}$ sends the request), which must fulfill that $\textit{vertex}_{\textit{src}}\in{\mathfrak{N}}_{v_{i}}^{s}$ ; the body (or message) of the request; and the sequence number, indicating the MoA which should answer the request (various calculation processes may be performed in parallel). In this case, as different paths may pass through a same vertex and various sequences of ERQ may be triggered, every elemental request (independently of the sequence number) is routed and transmitted. In order to make the proposed solution scalable, an epidemic protocol is also employed in this case. As the graph might include cycles, elemental requests are provided with a timestamp. Requests older than a pre-defined limit are discarded (so the system does not get overloaded).

Table 1
Double entry table constructed by MaA

Percentage of trustworthy received		Target vertex			Probability of trustworthiness
messages ${\Psi}\left[{s,t,x=\textit{true}}\right]$		$v_{1}^{t}$	…	$v_{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{t}}\right)}^{t}$
Source vertex	$v_{1}^{s}$	$\psi_{1,1}$	…	$\psi_{1,\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{t}}\right)}$	$\frac{1}{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{t}}\right)}\mathop{\sum}% \limits_{j}\psi_{1,j}$
	…	…	…	…	…
	$v_{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{s}}\right)}^{s}$	$\psi_{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{s}}\right),1}$	…	$\psi_{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{s}}\right),\textit{card}% \left({{\mathfrak{N}}_{v_{i}}^{t}}\right)}$	$\frac{1}{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{t}}\right)}\mathop{\sum}% \limits_{j}\psi_{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{s}}\right),j}$
Probability of trustworthiness		$\frac{1}{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{s}}\right)}\mathop{\sum}% \limits_{j}\psi_{j,1}$	…	$\frac{1}{\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{s}}\right)}\mathop{\sum}% \limits_{j}\psi_{j,\textit{card}\left({{\mathfrak{N}}_{v_{i}}^{t}}\right)}$

When an ERQ is received by a MoA (hosted in a vertex belonging to the n-th tier) it responds following a pre-defined pattern. For example, if MoA is visiting a temperature sensor, it generates an answer (known as “elemental answer”, EA) containing a preset value instead of the real temperature which is unknown by upper components. The EA also includes the original ERQ.

$\displaystyle EA=\left\{{\textit{ERQ},\textit{ans}_{\textit{msg}}}\right\}$

The EA are sent to the root vertex $v_{i}$ using also an epidemic protocol, in order to traverse all the possible paths. Each MoA gets hosted in the target vertices during a certain time (answering ERQ), and, when this time is over, they return to the root vertex.

When the root vertex receives the EA, it completes the information with the target vertex information (the vertex through which the root vertex $v_{i}$ receives the response), $\textit{vertex}_{\textit{trg}}\in{\mathfrak{N}}_{v_{i}}^{t}$ .

$\displaystyle EA=\left\{{\textit{ERQ},\textit{ans}_{\textit{msg}},\textit{% vertex}_{\textit{trg}}}\right\}$

When $v_{i}$ vertex has received all the information, the MaA constructs a double entry table (see Table 1). As EA should follow the pre-defined patterns, the MaA may evaluate the correctness (the trustworthiness) of the received information. Each cell in the double entry table is, then, completed with the corresponding percentage of trustworthy received messages. Each row is associated with a source vertex and each column with a target node.

The resulting double entry table, then, may be understood as a biparametric stochastic process ${\Psi}[s,t,\linebreak x]$ . Both parameters, $s$ and $t$ , represent the source $s$ and target $t$ vertices of the path to be studied. Random variable $X$ is a binary variable indicating the event of the path starting in $s$ and finishing in $t$ to trust ( $x=\textit{true}$ ) or not ( $x=\textit{false}$ ). The created table would contain the process realization for $x=\textit{true}$ .

Then, each realization $\psi_{i,j}$ may be understood as the percentage of correct answers received through paths starting in $v_{i}^{s}$ and finishing in $v_{j}^{t}$ . In order to obtain the probability of trustworthiness of each source and target node, the statistical mean value is obtained (having all cases the same probability) considering percentages for all paths starting or finishing in a certain vertex.

As percentages in the same row or column are already calculated considering the same denominator (the number of performed ERQ), it is guaranteed the global probability add up to the unit. Thus, all incoming edges (on the one hand), and all outcoming edges (on the other hand) have associated probabilities meeting the basic probabilistic theory assumption (probabilities of all considered events add up to the unit). Edges in the global trust graph are not required to fulfill any requirement a priori, as they are (in the general case) independent components.

The obtained information (as it only describes the beginning and end of every path), it is not enough to obtain directly the trust graph. Nevertheless, as said in Section 3.2, the first step in the trust graph calculation process is to obtain the trust circle of $v_{i}$ . In order to do that, the trustworthiness of the individual nodes has to be obtained. As said, a vertex is trustworthy if, an only if, paths starting or finishing in it are also trustworthy. Thus, considering the created table contains samples of an unknown random variable (as in any other regular laboratory experiment), the most appropriate way of estimating the trustworthiness of vertices is the sample mean (see Table 1). That is, the probability of a vertex to be trustworthy it is obtained as the sample mean of the probabilities associated with each one of the paths starting or finishing in that node.

$\displaystyle\bar{X}=\frac{1}{n}\sum_{j}X_{j}$

Considering an epidemic protocol transmits in all directions with the same probability, every sample in the sample mean is weighted by the same coefficient.

Computation of these values is a very easy and fast procedure, so sources are much better used than in traditional statistical frameworks for trust management. It must be noted (besides) that, in all this calculation process, the trustworthiness of a link is considered together the trustworthiness of the target vertex of that link.

Once the probability of being trustworthy is calculated for every adjacent vertex to the component under study, various possibilities appear. First, if the probability to be trustworthy of a vertex is above the threshold $\theta_{\max}$ , the component is considered trustworthy and all paths starting (or finishing) in this vertex are included in the trust graph. In the same way, if the probability to be trustworthy of a vertex is below the threshold $\theta_{\min}$ , the component is considered untrustworthy and all paths starting (or finishing) in this vertex are removed from the trust graph.

In order to remove untrustworthy paths, the initial and/or final vertices have to be pruned. In Industry 4.0 systems, components cannot remove any peer, as these privileges are reserved to the system administrator. In that way, pruning algorithm only removes the edge that connects the root vertex $v_{i}$ with the initial/final of the path to be pruned. No additional edges are created to maintain the connectivity. If the pruned path was the only outgoing/ingoing path, an alarm may be triggered to inform user and/or administrators about the situation.

For any other value $\theta$ that $\theta_{\min}\leqslant\theta\leqslant\theta_{\max}$ , a new calculation process should be triggered. Considering the information provided by MoA, the MaA decides if untrustworthiness elements are located in the downlink (from root vertex to vertex hosting the MoA) or in the uplink. In the first case, source vertex is requested to calculate its own trust graph. In second case, target vertex is equally requested. If a decision cannot be made about this issue, both vertices are directly requested to evaluate the trust graph.

Once requested nodes have finished the calculation process (following the same described procedure), they notify the root vertex this event with an acknowledgement message (ACK). This ACK has to indicate also if some paths have been pruned or no changes have been performed. In the first case, the probability to be trustworthy of the affected vertices is refreshed following the same procedure described above. If obtained probabilities are unclear another time, these values are considered, anyway, as the final results of the calculation process. In the second case no more actions are performed.

In any case, probabilities and trust graph and trust circle are refreshed periodically, in a continuous actualization process. The first time this process is triggered is after all ACK are received, and all untrustworthy paths pruned. From this point, the procedure is repeated periodically.

With the obtained results, the intermediate values are employed as weights to perform traffic engineering in the system and/or to include annotations (using, for example, Semantic Annotations) about the trustworthiness of the received messages.

The use of epidemic protocols makes the proposed technology very scalable, as the system get “flooded” in a fast way. However, depending on the system topology (more than on the number of components in the system), if several paths start/end in the same vertex and an untrustworthy component is located very deeply in the system, the calculation process may imply several steps and require more processing time than expected.

4. Experimental validation

In order to illustrate the behavior of the proposed algorithm, in this section we perform and provide some relevant simulations.

Simulations were performed using the (commercial) NS3 simulator, which implemented a platform based on TAP bridges and virtualization. The proposed platform turns independent the application layer and the network layer. The network layer is supported by NS3 simulator, where different libraries describing gossip protocols, mobile communications, wireless links, etc. are available. On the other hand, application layer is based on Linux paravirtualization, as native C programming is the most adequate to implement.

In particular, NS3 scenarios recreate very similar conditions to real networks, and using the described platform, nodes in these simulation scenarios may exchange data and implement applications as real hosts do.

The simulation scenarios are supported by TAP bridges, which connect virtual Linux machines generated using LXC technologies and the Libvirt library with the NS3 simulator, and ghost nodes, the virtual entity that represent the virtual machine in the network simulator. All parameters are provided using a configuration script, where nodes, network interfaces, connections and applications are described. Although this approach is more complex than the configuration of regular NS3 simulations, it is a more flexible design. TAP bridges are configured as “ConfigureLocal”. In that way, TAP bridges are generated in a row, inheriting the configuration properties one from the other. In order to orchestrate in the most efficient way the different parallel processes that require this type of simulations, a desktop application was created (see Fig. 6).

Figure 6.

Graphic management application.

The employed simulation scenario consisted of an Industry 4.0 system composed of 250 components, distributed as follows:

•

Physical layer (175 devices). It includes 100 smart transducers (temperature sensors and buzzers) based on Atmel microprocessors. Moreover, 70 Programmable Logic Controller (PLC) supported by Samsumg Artik devices are simulated. Finally, 5 robotic arms are simulated.

•

Control and data analytics (75 components). Twenty-five network devices among routers, bridges and hubs were considered. Besides 50 different middleware were also included (most of them focused on data translation).

•

User applications [25]. Finally, 25 user applications obtaining information from physical platform and representing it on a graphic interface are simulated.

Although, in a real scenario, components would present heterogenous characteristics. This situation is a pending challenge which is not addressed in this paper. In order to evaluate the performance of the proposed solution, all components are provided with a mobile agent framework: Mobile-C [67]. Agents are described using XML documents containing the proposed algorithms in C code. In a real application, the described solution should be combined with other software engineering mechanisms to enable the implementation of mobile agents on heterogeneous CPS [60].

Two different scenarios were created with these devices, each one with a different system topology: (1) total connectivity and (2) branched graph.

In the described scenarios, the proposed algorithm for trust graph calculation was implemented and deployed. In particular, user applications were programmed to include this solution and perform calculations.

The total simulated time was two days (48 hours), for which a 15 hours simulation was required. Each simulation was repeated five times in order to compensate the possible random effects.

At $t_{0}=$ 0 five random components were programmed to behave in an untrustworthy way. During simulation, other components could also turn untrustworthy depending on random effects. Different (3) cyber-attacks were also considered and planned at $t_{1}=$ 12 h, $t_{2}=$ 24 h and $t_{3}=$ 36 h.

The numeric values assigned to the parameters described in the proposed technology are included in Table 2. Values are only illustrative, and they are not optimized or specifically designed for any particular situation. These two values may be modified in an independent manner, as they are two non-related variables. Although both parameters may be fixed as desired, in order to maximize the success rate, it is advisable to study the application scenario before selecting values. However, some important general remarks should be done.

In extremely secure or trustworthy scenarios, $\theta_{\max}$ and $\theta_{\min}$ parameters would tend to be very close and have low values. In that way, only components with several negative reports will be declared malicious. On the other hand, in high-risk scenarios, both values will be high; as only one doubt about the trustworthiness of components is considered enough to prune this path. Using this two configurations, the calculation time will be severely reduced, as almost no recalculations will be done ( $\theta_{\max}$ and $\theta_{\min}$ are very close, see Section 3), and paths will be classified during the first evaluation.

However, when no information about the scenario is available, all possibilities should be considered to have the same probability, and $\theta_{\max}$ and $\theta_{\min}$ parameters should be homogeneously distributed. In this case, several recalculations will be done, and required time to obtain a stable trust graph is higher. This “worst case” is considered in this experimental validation. For any other situation, performance will be improved.

Table 2

Numeric values of the solution parameters

Parameter	Value	Comments
$\theta_{\max}$	0.75	Values in the upper quartile are considered to be trustworthy
$\theta_{\min}$	0.25	Values in the lower quartile are considered to be untrustworthy
Timestamp limit	180 s	ERQ older than three minutes are discarded

Simulations are employed to perform four experiments.

In the first experiment, we evaluate the convergence time of the proposed solution and the correctness of the calculated trust graphs.

In the second experiment, obtained results from the proposed solution are compared to results obtained from state-of-the-art proposals. In particular, although no similar solution has been reported, other techniques with similar purposes may be found. In this case we have selected one of the most popular proposals: reputation and trust models [26, 27]. This approach employs direct and indirect observations to evaluate if cyber-physical components are malicious and, then, isolate them.

In the third experiment, the solution performance is evaluated in a dynamic Industry 4.0 system (where components get in and out of the system) and compared to an equivalent traditional message-passing solution. Basically, the mean convergence time is evaluated. Components may appear at any available location (logical or physical) randomly. Different behavioral patterns are considered. In particular, the probability for a component to leave or enter the system follows a gaussian distribution with time, whose mean value (or average stay time) is varied in each experiment realization.

Finally, in the fourth experiment, the global system overload caused by trust calculation and graph pruning is measured.

5. Results

In respect to the first experiment, Fig. 7 shows the evolution in the use of the untrustworthy paths. As can be seen, the use rate of the untrustworthy paths follows, for both system topologies, a sigmoid-like evolution.

Figure 7.

Use of the untrustworthy paths.

If a branched graph topology is considered, only twenty [20] seconds are required to isolate (almost totally) the untrustworthy paths. However, as the number of edges and links goes up, more time is required to traverse all possible paths, and more calculation time is needed to obtain the final trust graph. In that way, as can be seen, around sixty [60] seconds are necessary to locate all the untrustworthy paths in a system with total connectivity. Although this period may be too high for traditional Internet scenarios, Industry 4.0 applications inherit some special characteristics from current industrial systems. In particular, broadband communications are not usually deployed and components tend to exchange few messages (as production tasks usually take some time to finish). Thus, the potential impact of untrustworthy behaviors goes up much more slowly than in traditional Internet scenarios.

It must be noted that, once equilibrium is reached it is stable. Small variations of the curves are due to the statistical (stochastic) nature of the proposed technology.

Results in Fig. 8 shows the success rate in locating and pruning the untrustworthy paths depending on the system topology and the type of untrustworthiness considered.

Figure 8.

Success rate.

As can be seen, in general systems with a branched topology allow a higher success rate. In these graphs, in fact, traversing all possible paths is easier, as it is possible to define an ordered way to analyze the topology. Any case, in all cases success rate is above 80%. As can be seen, this difference is obvious for initial and random trustworthiness, but smaller for cyber-attacks. Any case, all values are enough high to consider the proposed solution a successful technique.

Initial untrustworthiness is detected in a more successful way, and untrustworthiness appearing in a random way are the most difficult to locate. Cyber-attacks, as they are usually pretty “noisy”, they are detected and untrustworthy paths pruned in a more successful manner.

It is also important to remark that the proposed technique is applicable to both scenarios (branched graph and total connectivity graph), so Fig. 8 does not represent a comparison to show the obtained improvement. Below we are doing this analysis.

Once presented results from first experiment, we are discussing results from the second experiment. Figure 9 shows a comparison between results obtained from our proposal, and existing solutions’ performance [27, 63]. It is important to remark that both approaches are not equivalent, as application scenarios are not exactly the same, but this initial comparison may be employed to obtain first evidence about the obtained improvements.

Figure 9.

Use of the untrustworthy paths (comparison to the state of the art).

In order to show results in the most adequate manner, X-axis is represented in logarithmic form. Using this method, differences between the proposed solution and existing technologies are clearly shown, but comparisons between results for the different scenarios are hard to be analyzed. Considering this fact, some relevant initial conclusions may be done. First, the proposed solution is up to three magnitude orders faster than existing technologies [27] based on reputation and trust models. The main cause for this improvement is the different and new approach to evaluate the path trustworthiness. While in trust and reputation models [63] observations are passively collected during system operation, in the proposed mechanism a proactive evaluation procedure is considered. Thus, previous proposals [63] require a lot of time (up to fifty hours, depending on the number of carried out transactions among components per time unit) to collect enough information to make a decision and prune untrustworthy paths. On the contrary, proposed mechanism performs an exhaustive proactive evaluation procedure which detects malicious components and paths in few minutes.

And, second, the scalability of the proposed technique is much better than existing message-passing approaches. In fact, as can be seen in Fig. 9, when a branched graph increases its interconnections and turns into a total connectivity graph, previous mechanisms increment the required time to locale all malicious components and untrustworthy paths in two magnitude orders. Nevertheless, the proposed solution only duplicates that time. Therefore, using the proposed solution, the scalability curve has transformed from an exponential function to a linear law.

Previous experiment compares the proposed solution to existing trust evaluation technologies. On the contrary, third experiment compares the proposed agent-based solution to the equivalent message-passing algorithm. Besides, in this case, a more realistic scenario is considered, where components leave and enter the system dynamically. Figure 10 shows the obtained results for this experiment.

Figure 10.

Convergence time (comparison to message-passing algorithm).

As can be seen, both solutions (agent-based and message-passing) have a pretty similar behavior when components remain in the same position for a long period (long enough to evaluate trust completely). In that case, both algorithms behave similar, and convergence time is around values showed in Fig. 7. However, this situation totally changes when components leave and enter dynamically the system. In that case, the use of agents is a key feature for a fast and efficient trust evaluation.

In fact, as ERQ do not include any information about the node identity, in the equivalent message-passing algorithm, identifiers from the communication subsystem must be employed. These identifiers are ephemeral, and trust is not correctly evaluated until each component remains enough time in one specific location. This requirement is more costly as components change their location more frequently, as can be seen in Fig. 10. Convergence time for the message-passing algorithm is up to two magnitude orders higher. On the contrary, agents are hosted by the component (vertex) under study, so it may be univocally recognized using serial numbers or other similar information which is globally unique. In that way, even if components change their location, mobile agents recognize the components; and convergence time is independent from the location and only depends on the time the components remain connected to the system. Any case, when components are moving around the system, convergence time increases around 400% (although the specific amount depends on the behavior pattern of components).

Finally, results from the fourth experiment are analyzed to prove the proposed solution is lightweight. Two different indicators are measured to evaluate the overload in the system due to trust calculation: memory usage and consumed processing time. Table 3 shows the obtained measures.

Table 3

Resource consumption

Scenario	Period	Overload indicators
		Memory	Processing
		usage	time
Branched graph	Initial calculation	9.8%	11.2%
	Actualization	6.7%	2%
Total connectivity	Initial calculation	12.2%	16.1%
	Actualization	8%	3.9%

As can be seen, in all cases, the overload is higher during the initial calculation than during the (continuous) actualization processes. Any case, the mean aggregated overload is lower than 10% (around 9%). This value goes down to 5% (a reduction around 50%) during the continuous actualization process. In respect to memory usage, the overload is around 10%, although it may reach up to 12% for total connectivity graphs during the initial calculation. These values are a little bit higher for processing time, appearing an overload around 16% for total connectivity graphs during the initial calculation.

These values, for small resource constrained devices, show that the proposed technology is lightweight, as said before. In fact, previously reported lightweight security (encryption) techniques are proved to consume similar amounts of resources (an overload around 10%) [64]. Besides, other similar solutions (based on trust and reputation models) [63] are explicitly said to be implemented in powerful software components, as algorithms are complex and costly.

Actually, if common simple data gathering programs and algorithms are considered in sensor nodes, the generated overload by our solution is almost negligible in absolute vales. However, if complex algorithms and costly software is being already executed, this overload could be excessive; but this situation is inevitable for any proposed security solution.

6. Conclusions

In this work, a trust management technology to protect resource constrained systems against malicious components is proposed.

A mathematical formalization describing engineered resource constrained systems (particularized in Industry 4.0 systems) as directed graphs is described. Using the proposed notation and formalization, a model for trust in Industry 4.0 systems, defining important concepts such as the trust graph (absolute or relative) and the trust circle, is proposed. A probabilistic model based on a biparametric stochastic process is also described.

A platform based on two different types of agents support the calculation of the trustworthy paths in the graph describing an Industry 4.0 system. Untrustworthy paths are pruned and situations where a clear decision cannot be made, are solved using traffic engineering techniques.

Simulation results showed a success rate above 80% in all cases. Besides, a maximum calculation time of 600 seconds is required to locate all possible untrustworthy paths in total connectivity graphs. Comparisons to the state-of-the-art solutions show an improvement in the required calculation time and scalability. Besides, the generated system overload proves the proposed solution to be lightweight.

From the engineering point of view, different challenges must be faced when implementing the proposed solution. First, specific programming solutions for CPS must be employed, in order to solve problems such as timing (in our case the specific C-based language provided with microcontrollers was employed). Second, sometimes, mathematical expressions cannot be directly applied due to hardware characteristics (for example, no floating-point operation could be allowed). Then, adaptations are required. And, third, solutions to send mobile agents to all nodes under study must be coordinated with self-configuration technologies. That’s basic to share network addresses, communications ports, etc., but if built-in communication protocols are employed, it may require an additional effort.

Finally, the system topology is a basic variable in the proposed technology, so future works should consider this variable in the calculation process in order to optimize as much as possible the use of system resources (which, in Industry 4.0 systems, use to be scarce). In this sense, the proposed solution could be improved from a computational point of view if visibility graphs are employed [65]. Future works will also consider this technology.

Future works will also describe some specific solution for situations where the trustworthiness of paths is unclear.

Footnotes

Acknowledgments

Authors received funding from the Ministry of Economy and Competitiveness through SEMOLA project (TEC2015-68284-R).

References

Lee

. Cyber-physical systems-are computing foundations adequate. In: Position Paper for NSF Workshop On Cyber-Physical Systems: Research Motivation, Techniques and Roadmap (Vol. 2). 2006.

Varghese

Tandur

. Wireless requirements and challenges in Industry 4.0. In: 2014 International Conference on Contemporary Computing and Informatics (IC3I). 2014. pp. 634-638.

Robles

Alcarria

Martín de Andrés

Navarro

Calero

Iglesias

, et al. An IoT based reference architecture for smart water management processes. JoWUA. 2015; 6(1): 4-23.

Martín de Andrés

Alcarria

Robles

Morales Dominguez

. A Systematic Approach for Service Prosumerization in IoT Scenarios. In: Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS). 2013. pp. 494-499.

Mosterman

Zander

. Industry 4.0 as a cyber-physical system study. Software & Systems Modeling. 2016; 15(1): 17-29.

Lee

Bagheri

Kao

. A cyber-physical systems architecture for industry 4.0-based manufacturing systems. Manufacturing Letters. 2015; 3: 18-23.

Bordel Sánchez

Alcarria

Martín

Robles

. TF4SM: a framework for developing traceability solutions in small manufacturing companies. Sensors. 2015; 15(11): 29478-29510.

Yampolskiy

Horváth

Koutsoukos

Xue

Sztipanovits

. A language for describing attacks on cyber-physical systems. International Journal of Critical Infrastructure Protection. 2015; 8: 40-52.

Yampolskiy

Horvath

Koutsoukos

Xue

Sztipanovits

. Taxonomy for Description of Cross-Domain Attacks on CPS. In: Proceedings of the Second ACM International Conference on High Confidence Networked Systems. 2013. pp. 135-142.

10.

Wang

Zhang

Neri

Jiang

Zhao

Gheorghe

Lefticaru

. Design and implementation of membrane controllers for trajectory tracking of nonholonomic wheeled mobile robots. Integrated Computer-Aided Engineering. 2016; 23(1): 15-30.

11.

Bordel

Alcarria

Sánchez-de-Rivera

. Detecting Malicious Components in Large-Scale Internet-of-Things Systems and Architectures. In: World Conference on Information Systems and Technologies. Springer, Cham. 2017. pp. 155-165.

12.

Jazdi

. Cyber physical systems in the context of Industry 4.0. In: Automation, Quality and Testing, Robotics, 2014 IEEE International Conference on. IEEE. 2014, May. pp. 1-4.

13.

Bordel

Alcarria

Martín

Robles

de Rivera

. Self-configuration in humanized Cyber-Physical Systems. J. Ambient Intelligence and Humanized Computing. 2017; 8(4): 485-496.

14.

Sánchez

Alcarria

Sánchez-Picot

de Rivera

. A methodology for the design of application-specific cyber-physical social sensing co-simulators. Sensors. 2017; 17(10): 2177.

15.

Hofmann

Rüsch

. Industry 4.0 and the current status as well as future prospects on logistics. Computers in Industry. 2017; 89: 23-34.

16.

Sommer

. Industrial revolution-industry 4.0: Are German manufacturing SMEs the first victims of this revolution? Journal of Industrial Engineering and Management. 2015; 8(5): 1512-1532.

17.

Shrouf

Ordieres

Miragliotta

. Smart factories in Industry 4.0: A review of the concept and of energy management approached in production based on the Internet of Things paradigm. In Industrial Engineering and Engineering Management (IEEM), 2014 IEEE International Conference on. IEEE. 2014, December. pp. 697-701.

18.

Lee

Bagheri

Kao

. A cyber-physical systems architecture for industry 4.0-based manufacturing systems. Manufacturing Letters. 2015; 3: 18-23.

19.

Tang

Sun

Liu

Wang

. Resource leveling based on line of balance and constraint programming. Computer – Aided Civil and Infrastructure Engineering. 2018; 33(10): 864-884.

20.

Bordel

Alcarria

de Rivera

Robles

. Process execution in Cyber-Physical Systems using cloud and Cyber-Physical Internet services. The Journal of Supercomputing. 2018; 1-43.

21.

Sánchez

Alcarria

de Rivera

Sánchez-Picot

. Enhancing process control in industry 4.0 scenarios using cyber-physical systems. JoWUA. 2016; 7(4): 41-64.

22.

Chen

Wang

. A cloud-based trust management framework for vehicular social networks. IEEE Access. 2017; 5: 2967-2980.

23.

Wang

Wan

Zhang

. Towards smart factory for industry 4.0: a self-organized multi-agent system with big data based feedback and coordination. Computer Networks. 2016; 101: 158-168.

24.

Yan

Zhang

Vasilakos

. A survey on trust management for internet of things. Journal of Network and Computer Applications. 2014; 42: 120-134.

25.

Geetha

. Social internet of things. World Scientific News. 2016; 41-76.

26.

Bao

Chen

Chang

Cho

. Hierarchical trust management for wireless sensor networks and its applications to trust-based routing and intrusion detection. IEEE Transactions on Network and Service Management. 2012; 9(2): 169-183.

27.

Bao

Chen

. Trust management for the internet of things and its application to service composition. In: IEEE International Symposium on World of Wireless, Mobile and Multimedia Networks (WoWMoM). 2012. pp. 1-6.

28.

Daly

Haahr

. Social network analysis for information flow in disconnected delay-tolerant MANETs. IEEE Transactions on Mobile Computing. 2009; 8(5): 606-621.

29.

Zhang

Selic

Ali

Yue

Okariz

Norgren

. Understanding Uncertainty in Cyber-Physical Systems: A Conceptual Model. In: Modelling Foundations and Applications: 12th European Conference, ECMFA 2016; 2016 July 6-7. Vienna, Austria: Springer. 2016. pp. 247-264.

30.

Hasan

Curry

. Approximate semantic matching of events for the internet of things. ACM Transactions on Internet Technology (TOIT). 2014; 14(1): 2.

31.

Aggarwal

Ashish

Sheth

. The internet of things: A survey from the data-centric perspective. In: Managing and Mining Sensor Data. Springer US. 2013. pp. 383-428.

32.

Chen

Chang

Sun

Jia

Wang

. TRM-IoT: A trust management model based on fuzzy reputation for internet of things. Computer Science and Information Systems. 2011; 8(4): 1207-1228.

33.

Truong

Lee

. A reputation and knowledge based trust service platform for trustworthy social internet of things. In: 19th Conference in Innovations in Clouds, Internet and Networks (ICIN); 2016 March 1-3. Paris, France. 2016.

34.

Xiao

Sidhu

Christianson

. Guarantor and reputation based trust model for social internet of things. In: International Wireless Communications and Mobile Computing Conference (IWCMC); 2015 Aug 24-28. Dubrovnik, Croatia. 2015. pp. 600-605.

35.

Guha

Kumar

Raghavan

Tomkins

. Propagation of trust and distrust. In: Proceedings of the 13th international conference on World Wide Web; 2004 May 17-22. New York, USA; ACM. 2004. pp. 403-412.

36.

Jiang

Baras

. Graph algebraic interpretation of trust establishment in autonomic networks. In: Preprint Wiley Journal of Networks. Wiley. 2009.

37.

Trajdos

Kurzynski

. A correction method of a binary classifier applied to multi-label pairwise models. International Journal of Neural Systems. 2018; 28(9): 1750062.

38.

Golbeck

Parsia

Hendler

. Trust networks on the semantic web. In: Proceedings of the 7th International Workshop on Cooperative Information Agents (CIA-2003). 2003; pp. 238-249.

39.

Shmatikov

Talcott

. Reputation-based trust management. Journal of Computer Security. 2005; 13(1): 167-190.

40.

Ziegler

Lausen

. Spreading activation models for trust propagation. In: Proceedings of IEEE International Conference on e-Technology, e-Commerce and e-Service; 2004 March 28-31. Taipei, Taiwan. IEEE. 2004. pp. 83-97.

41.

Theodorakopoulos

Baras

. On trust models and trust evaluation metrics for ad hoc networks. IEEE Journal on selected areas in Communications. 2006; 24(2): 318-328.

42.

Rostami

Neri

. Covariance matrix adaptation pareto archived evolution strategy with hypervolume-sorted adaptive grid algorithm. Integrated Computer-Aided Engineering. 2016; 23(4): 313-329.

43.

Blaze

Feigenbaum

Lacy

. Decentralized trust management. In: Proceedings of the IEEE Symposium on Security and Privacy; 1996 May 6-8. California, USA. IEEE. 1996. pp. 164-173.

44.

Cornelli

Damiani

di Vimercati

SDC

Paraboschi

Samarati

. Choosing reputable servents in a P2P network. In: Proceedings of the Eleventh International World Wide Web Conference; 2002 May 07-11. Honolulu, USA. ACM. 2002. pp. 376-386.

45.

Liu

. A novel efficient learning algorithm for self-generating fuzzy neural network with applications. International Journal of Neural Systems. 2012; 22(1): 21-35.

46.

Kretzschmar

Stachniss

Grisetti

. Efficient information-theoretic graph pruning for graph-based SLAM with laser range finders. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); 2011 Sep. 25-30. San Francisco, USA. IEEE. 2011. pp. 865-871.

47.

Dai

. Distributed dominant pruning in ad hoc networks. In: Proceedings of the IEEE International Conference on Communications; 2003 May 11-15; Anchorage, Alaska, USA. IEEE. 2003. pp. 353-357.

48.

Rostami

Neri

Epitropakis

. Progressive preference articulation for decision making in multi-objective optimisation problems. Integrated Computer-Aided Engineering. 2017; 24(4): 315-335.

49.

Takimoto

Warmuth

. Predicting nearly as well as the best pruning of a planar decision graph. Theoretical Computer Science. 2002; 288(2): 217-235.

50.

Navigli

Velardi

Faralli

. A graph-based algorithm for inducing lexical taxonomies from scratch. In: International Joint Conference on Artificial Intelligence; 2011 July 16-22. Barcelona, Spain. 2011. pp. 1872-1877.

51.

Sixtus

Ortmanns

. High quality word graphs using forward-backward pruning. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing; 1999 March 15-19. Phoenix, Arizona, USA. IEEE. Vol. 2. 1999. pp. 593-596.

52.

Kotz

Gray

. Mobile agents and the future of the internet. Operating Systems Review. 1999; 33(3): 7-13.

53.

Godfrey

Jha

Nair

. On a mobile agent framework for an internet of things. In: Communication Systems and Network Technologies (CSNT), 2013 International Conference on. IEEE. 2013, April. pp. 345-350.

54.

Fortino

Guerrieri

Russo

Savaglio

. Integration of agent-based and cloud computing for the smart objects-oriented IoT. In: Computer Supported Cooperative Work in Design (CSCWD), Proceedings of the 2014 IEEE 18th International Conference on. IEEE. 2014, May. pp. 493-498.

55.

Cimellaro

Mahin

Domaneschi

. Integrating a human behavior model within an agent – based approach for blasting evacuation. Computer – Aided Civil and Infrastructure Engineering. 2019; 34(1): 3-20.

56.

Bahrami

. Design of a mobile agents based solution to distributional management of computer networks, taking into account the security mechanisms. Engineering, Technology & Applied Science Research. 2017; 7(5): 2026-2029.

57.

Vigna

. (Ed.). Mobile agents and security (Vol. 1419). Springer. 2003.

58.

Derler

Lee

Vincentelli

. Modeling cyber-physical systems. Proceedings of the IEEE. 2012; 100(1): 13-28.

59.

Rajhans

Bhave

Ruchkin

Krogh

Garlan

Platzer

Schmerl

. Supporting heterogeneity in cyber-physical systems architectures. IEEE Transactions on Automatic Control. 2014; 59(12): 3178-3193.

60.

Lin

Sedigh

Hurson

. An agent-based approach to reconciling data heterogeneity in cyber-physical systems. In: Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011 IEEE International Symposium on. IEEE. 2011, May. pp. 93-103.

61.

Chess

Harrison

Kershenbaum

. Mobile agents: Are they a good idea? In: International Workshop on Mobile Object Systems. Springer, Berlin, Heidelberg. 1996, July. pp. 25-45.

62.

Wan

Chen

Xia

Zhou

. From machine-to-machine communications towards cyber-physical systems. Computer Science and Information Systems. 2013; 10(3): 1105-1128.

63.

Bordel

Alcarria

De Andrés

You

. Securing Internet-of-Things systems through implicit and explicit reputation models. IEEE Access. 2018; 6: 47472-47488.

64.

Bordel

Orúe

Alcarria

Sánchez-de-Rivera

. An intra-slice security solution for emerging 5G networks based on pseudo-random number generators. IEEE Access. 2018; 6: 16149-16164.

65.

Ahmadlou

Adeli

. Visibility graph similarity: A new measure of generalized synchronization in coupled dynamic systems. Physica D: Nonlinear Phenomena. 2012; 241(4): 326-332.

66.

Lee

Bagheri

Kao

. A cyber-physical systems architecture for industry 4.0-based manufacturing systems. Manufacturing Letters. 2015; 3: 18-23.

67.

Chen

Cheng

Palen

. Mobile – C: a mobile agent platform for mobile C/C++ agents. Software: Practice and Experience. 2006; 36(15): 1711-1733.

An agent-based method for trust graph calculation in resource constrained environments

Abstract

Keywords

1. Introduction

2.1 Industry 4.0 solutions

2.2 Trust management

2.3 Pruning techniques

2.4 Mobile agents and CPS

3. Trust graph calculation

3.1 Fundamentals of the mathematical modeling

Table 1 Double entry table constructed by MaA

Footnotes

Acknowledgments

References

Table 1
Double entry table constructed by MaA