Abstract
Edge computing extends the cloud computing paradigm to the edge of the network to offset the shortcomings of traditional cloud computing in mobility support, low latency, and location awareness. However, traditional complex encryption algorithms, access control measures, identification protocols, and privacy protection methods are ill-suited to the security defense of edge computing, given its multisource data fusion, the convergence of mobile networks and the Internet, and the limited storage, computing, and battery capacity of edge terminals. Therefore, a reasonable defense model must be established for the complex, dynamic edge computing environment. Taking into account the limited resources of edge devices and drawing on dynamic game theory, this study proposes an optimal defense strategy model based on a differential game and solves for the optimal defense strategy of edge nodes over infinite and finite horizons. The optimal defense strategy of the edge nodes is simulated under different conditions, and the simulation verifies that the edge nodes obtain the optimal defense effect with minimum resource consumption when they cooperate to form the defense system.
Introduction
With the development of wearable devices, smart electricity meters, smart cities, the Internet of Vehicles, and large-scale wireless sensor networks, the Internet of Things (IoT) will expand the scale of the traditional IT industry. Cisco conservatively estimates that 50 billion IoT devices will be connected to the network by 2020. The large amount of data generated by these devices will put great pressure on analysis, storage, and communication [1]. Edge computing is an effective solution to this challenge. The core of this model is to use numerous interoperable terminal devices or network edge devices to decentralize the communication and computing requirements of users. The advantage of using terminal devices or network edge devices over cloud computing for task processing is that these devices can effectively reduce latency and extend network coverage. For example, a device can help network operators reduce data traffic by sharing video content directly with its neighbors through direct communication. Edge computing distributes centralized computing resources to the vicinity of the data source (called the edge layer) and allocates some computing tasks to the computing resources of the edge layer to improve the real-time performance of those tasks [2]. “Edge” refers to any computing and network resources on the path from the data source to the cloud computing center. For example, wearable medical devices can be considered the edge between individual users and cloud computing centers. Gateways in smart homes can be regarded as the edge between in-home electronic devices and cloud computing centers. Cloudlets can also be considered the edge between mobile devices and cloud data centers [3].
However, edge computing nodes are frequently deployed in open and insecure environments and lack effective physical protection. Given the large scale of network systems and applications, an edge computing system faces more complex security threats than traditional computing systems. In the edge computing environment, external and internal attackers can attack any entity at any time. Moreover, edge devices are scattered, and most attacks in the edge computing framework are confined to a local environment. Therefore, local edge nodes, such as edge data centers, can be responsible for supervising the network connections of all nodes, virtual machines, and the surrounding environment. Attacks on edge service nodes are therefore diverse [4, 5]. Without appropriate defense mechanisms, any attack that poses a threat may gradually degrade the functions of the entire edge node service cluster. Therefore, effective defense models must be deployed on edge node clusters to protect the security of edge user devices. Mtibaa et al. proposed a detection and tracking algorithm based on HoneyBot nodes in the mobile edge computing environment, which detects and isolates suspected malicious local nodes on insecure device-to-device communication channels [6]. In 2016, Vassilakis et al. used an aggressive decoy method to protect data. In this method, data access is monitored to detect any abnormal access pattern. Then, abundant decoy information is returned to the sender when unauthorized access is suspected [7]. Recently, game theory has provided a new approach to defense optimization. Applying game theory to the security of edge computing makes it possible to establish a mathematical model of the conflict of interest between attacker and defender. The costs of different strategies can be weighed while accounting for limited resources and for whether the defense system can initiate decision-making.
The goal is to reach the optimal level of defense with the lowest resource consumption. In [8], the DDoS attack in a mobile edge computing environment is studied, and a non-cooperative game model based on the Stackelberg game is designed to study the defense strategy of mobile users. Sun et al. proposed a method for modeling network attacks as a zero-sum multi-objective game [9]. They combined Q-learning and Pareto optimization to determine the most harmful attacks and, consequently, the best defense strategy against them. Alese et al. presented an n-player game model of location privacy that analyzes the behavior of mobile nodes in the network [10]. Each player attempts to maximize its location privacy at minimum cost by strategically choosing a series of actions in the game.
These studies have mainly discussed the defense strategy problem of the edge computing system through static games or heuristic algorithms. Neither can describe the continuous temporal dynamics of an actual edge computing system. Because the edge computing environment changes continuously, many attack and defense behaviors and state variables cannot be discretized in time. The optimal strategy at one moment may not be optimal at the next. Therefore, a defense strategy should be formulated in accordance with environmental changes. Time adds dynamics to a game, and the continuity of time turns a dynamic game into a differential game. The differential game combines the solution concepts of game theory with control theory to find the optimal strategy for each player. It is closely related to optimal control problems: players want to control the state of the system in order to achieve their goals [11, 12]. Given the dynamic characteristics of defense strategy in edge computing, differential game theory [13] is applied here to study the defense strategy of edge nodes, because the differential game matches the time continuity of the defense cooperation system in actual edge computing and is well suited to studying the dynamic defense control of the computing system. A cooperative differential game model of a defense strategy for edge nodes is constructed. The present study makes several major contributions. A differential game model for the defense strategy of edge nodes is constructed with full consideration of the dynamic characteristics of edge node deployment in the defense system and of the balance between reward and energy consumption cost. The optimal defense strategy and payoffs of edge nodes under grand coalition, intermediate coalition, and non-cooperation are analyzed.
The analysis confirms that edge nodes reach the optimal outcome, the highest payoffs for the fewest resources, only by forming a grand coalition for the cooperative deployment of the defense strategy. This deployment protects the data security of edge computing and decreases the resource consumption of the defense system. A fair and reasonable reward allocation within the cooperative defense coalition formed by the edge nodes is provided on the basis of the Shapley value.
The remainder of this paper is organized as follows. Section 2 establishes a differential game model for the defense strategy of edge nodes. Section 3 solves the model in different time domains and uses the Shapley value to assign the reward. The simulation for the constructed differential game model is analyzed in Section 4. Section 5 concludes the paper.
Defense strategy differential game model
In this study, the balance between the rewards gained by the edge nodes from the deployment of defense measures and the energy consumption cost in the defense system of the edge computing environment is considered. Edge nodes have limited resources in the defense system. Each edge node can independently choose how many resources to consume in deploying defense measures, and it obtains rewards from the cloud service provider in return. We suppose that the edge nodes are rational and aim to gain the maximum rewards at the minimum energy consumption cost. Thus, the edge nodes must consume an appropriate amount of resources to strike a compromise between rewards and cost. A differential game model is constructed for the process by which edge nodes select the optimal resource consumption in deploying defense measures. N = {1, 2, ⋯, n} denotes the set of edge nodes in the defense system, and all edge nodes are participants in the game. r_i(t) is the defense strategy of edge node i, which expresses the resource consumption of edge node i at t. If x(t) represents the defense ability of the defense system at t, then the variation of resource consumption in the system can be expressed by the following differential equation [14]:
The edge node aims to maximize its payoffs as a participant in the game. The payoff that edge node i gains at t under the resource consumption r_i(t) is expressed as g_i(t), which is determined by the reward for deploying defense measures with those resources, Reward_i[r_i(t)], and the cost, Cost_i[r_i(t)].
According to Reference [15], the relation function between the reward and the resource consumption r_i(t) of an edge node is
Energy consumption increases with the resource consumption of edge nodes. A linear positive correlation exists between energy cost and resource consumption [16, 17]. In the present study, the energy consumption cost for the resource consumption of an edge node is set to ɛ_i r_i(t), where ɛ_i is the energy cost factor per unit resource. Thus, an edge node may reduce resource consumption as much as possible to decrease its energy consumption loss. Furthermore, the rewards gained by an edge node are also affected by the overall defense level of the defense system. If the system has a high defense level, then the payoffs of edge nodes are influenced. If the security defense requirements of a cloud service provider are fixed, then the reward is also fixed. If excessive resources are consumed in the system, then the reward per unit resource falls, which lowers the payoffs of the edge node. The strength of an attacker can also influence the payoffs of the defense system: a high attacking strength means low system payoffs. Therefore, the loss cost caused by the resources consumed by other edge nodes in the defense system and by the attacking strength (χ_v) of attacker v on edge node i is expressed as χ_v α_i [x(t) − r_i(t)], where α_i is the influencing factor of system payoffs. In summary, the cost loss of edge node i is:
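Read together, the energy term and the interference term described above suggest the following form. This is a reconstruction from the prose rather than a quotation of the original equation, so the grouping of terms should be treated as an assumption:

```latex
\mathrm{Cost}_i\big[r_i(t)\big] = \varepsilon_i\, r_i(t) + \chi_v\, \alpha_i \big[x(t) - r_i(t)\big]
```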
In the game, each edge node deploys defense measures by consuming resources in order to maximize its payoffs. However, reward and cost pull in opposite directions, so players must select defense strategies that balance the two. In addition, all edge nodes that participate in the game may influence the payoffs of edge node i. A dynamic optimization problem can therefore be constructed for the defense strategy of an edge node, in which λ is the discount rate.
s.t. (1).
The optimal defense strategies of edge nodes under grand coalition, intermediate coalition, and non-cooperation are solved on the basis of this optimization problem. The optimal resource consumption of each edge node is calculated to reduce the energy cost as far as possible and achieve the maximum payoffs. In addition, the payoffs of edge nodes under grand coalition, intermediate coalition, and non-cooperation are compared by calculating the characteristic function value of the game. The total payoffs are allocated to the edge nodes involved in the cooperation using the Shapley value. To make the conclusions general, the game results are analyzed over both finite and infinite horizons.
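The discounted objective described above can be evaluated numerically once a resource path and a defense-level path are given. The sketch below is illustrative only: the cost follows the two terms named in the text, while `sqrt_reward` is a hypothetical concave stand-in for the reward function of [15], and the function names and the left Riemann sum are assumptions of this example.

```python
import math

def cost(r_i, x, eps_i, alpha_i, chi_v):
    """Energy term plus interference term, as described in the text."""
    return eps_i * r_i + chi_v * alpha_i * (x - r_i)

def sqrt_reward(r_i):
    """Hypothetical concave reward; NOT the reward function of Reference [15]."""
    return math.sqrt(r_i)

def discounted_payoff(r_path, x_path, dt, lam, eps_i, alpha_i, chi_v,
                      reward=sqrt_reward):
    """Left Riemann sum of exp(-lam * t) * (reward - cost) over sampled paths."""
    total = 0.0
    for k, (r_i, x) in enumerate(zip(r_path, x_path)):
        t = k * dt
        g = reward(r_i) - cost(r_i, x, eps_i, alpha_i, chi_v)
        total += math.exp(-lam * t) * g * dt
    return total
```

With constant paths the sum approaches g(1 − e^{−λT})/λ as the step shrinks, which gives a quick sanity check on any implementation of the objective.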
Infinite-horizon model solution
Suppose the edge node does not know when the attack will end. This case is treated as a differential game over an infinite horizon, with game time t ∈ [0, ∞). The optimal defense strategies of the edge node under non-cooperation, grand coalition, and intermediate coalition over the infinite horizon are then solved.
Feedback Nash equilibrium in the infinite-horizon
A rational, selfish edge node pursues its maximum individual payoff in a non-cooperative game. Thus, an objective function can be constructed for the defense strategy optimization problem of an edge node over the infinite horizon.
s.t. (1).
A feedback Nash equilibrium solution must be determined in the non-cooperative game model, which is used as the optimal control of an edge node. Then, the feedback Nash equilibrium is solved in accordance with the Bellman equation.
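For reference, the Bellman (Hamilton-Jacobi-Bellman) condition for a feedback Nash equilibrium in a discounted infinite-horizon differential game has the standard textbook form below. Here f stands for the right-hand side of the state equation (1), V_i for the value function of node i, and r_{-i}^{*} for the equilibrium strategies of the other nodes; these are generic placeholders and not the paper's specific expressions.

```latex
\lambda V_i(x) = \max_{r_i} \Big\{ g_i\big(x, r_i, r_{-i}^{*}(x)\big)
  + \frac{\partial V_i(x)}{\partial x}\, f\big(x, r_i, r_{-i}^{*}(x)\big) \Big\},
  \qquad i \in N .
```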
The strategy set
Thus, the solution is expressed as follows.
Here, the supposition is defined as follows.
Therefore,
then,
Accordingly, the feedback Nash equilibrium solutions when edge nodes deploy defense strategies independently (non-cooperation) in the infinite-horizon can be calculated as
All edge nodes can form a grand coalition in the deployment of a defense strategy. In this case, all edge nodes cooperate to maximize the total payoffs of the coalition to protect the optimal performance of the coalition. Thus, the optimal control problem of a grand coalition is
s.t. (1).
The grand coalition solves a standard dynamic programming problem to maximize the payoffs of all players under the constraint of the dynamic system model. The control set
If
The optimal defense strategy when all edge nodes form a grand coalition is
The game equilibrium strategy is substituted into Equation (1).
The optimal trajectory of system defense ability under cooperation can be obtained by solving the abovementioned differential equation.
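As an illustration of how such a trajectory can be computed, the sketch below integrates a linear state equation of the form dx/dt = R − δx, where R is the total (constant) equilibrium resource consumption and δ is a decay rate. This form, the parameter names, and the function names are assumptions made for the example; Equation (1) itself is not reproduced in this text and may differ.

```python
import math

def simulate_defense_level(x0, total_r, delta, horizon, dt=0.01):
    """Forward-Euler integration of dx/dt = total_r - delta * x (assumed form)."""
    steps = int(round(horizon / dt))
    x = x0
    path = [x]
    for _ in range(steps):
        x += dt * (total_r - delta * x)
        path.append(x)
    return path

def closed_form_defense_level(x0, total_r, delta, t):
    """Exact solution of the same linear ODE, used to check the Euler path."""
    steady = total_r / delta
    return steady + (x0 - steady) * math.exp(-delta * t)

if __name__ == "__main__":
    path = simulate_defense_level(x0=1.0, total_r=5.0, delta=0.5, horizon=10.0)
    print(path[-1], closed_form_defense_level(1.0, 5.0, 0.5, 10.0))
```

Under this assumed form the defense level converges to R/δ, so constant equilibrium strategies imply a steady-state defense ability.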
When K edge nodes cooperate in the deployment of defense measures, a value function W(K, x, t) is defined for each coalition K (|K| < N) that satisfies the Bellman equation.
The equation can be calculated through the solving method of a grand coalition.
If an edge node knows when the malicious attacking behavior will end, then the situation can be treated as a differential game over a finite horizon. The optimal defense strategies of edge nodes under non-cooperation, grand coalition, and intermediate coalition over a finite horizon are analyzed below. The game time is set to t ∈ [t_0, T].
Feedback Nash equilibrium solution in the finite-horizon
Under the non-cooperative feedback Nash equilibrium, the defense strategy of an edge node is determined only by the current time and state and does not depend on the history of the game, including the initial state. The payoff of edge node i from any moment of the game onward is higher than or equal to the payoff it would obtain by deviating unilaterally from the feedback Nash equilibrium. The payoff of the edge node at the final moment equals the terminal payoff. Therefore, the control of an edge node is {u_i(t) = v_i(t, x), i ∈ N}. The optimal control set of the edge nodes is the feedback Nash equilibrium of the game.
Similarly, the feedback Nash equilibrium of the non-cooperation game model in the finite-horizon was solved using the Bellman equation, which was used as the optimal defense strategy of the edge node.
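In the finite-horizon case the value function depends on time as well as state, and the Bellman condition takes the standard form below. As before, f denotes the right-hand side of Equation (1) and r_{-i}^{*} the equilibrium strategies of the other nodes; q_i(x) is a generic placeholder for the terminal payoff and is an assumption of this sketch, not the paper's specific expression.

```latex
-\frac{\partial V_i(t,x)}{\partial t} = \max_{r_i} \Big\{
  g_i\big(x, r_i, r_{-i}^{*}(t,x)\big)\, e^{-\lambda (t - t_0)}
  + \frac{\partial V_i(t,x)}{\partial x}\, f\big(x, r_i, r_{-i}^{*}(t,x)\big) \Big\},
\qquad V_i(T, x) = q_i(x)\, e^{-\lambda (T - t_0)} .
```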
The control set
If
Equation (27) can be rewritten as follows at T:
The combination of Equations (25) and (28) yields
The partial derivative of V_i(t, x) with respect to t can be obtained from Equation (27).
Moreover, the partial derivative of V_i(t, x) with respect to x is obtained from Equation (27).
Equations (26) and (31) are substituted into Equation (24). Thus,
Equations (32) and (30) are combined to obtain
To make Equation (33) true, the following equation must be satisfied:
This equation is solved as
In particular, the edge node can gain the highest payoffs when it consumes
When all edge nodes form a grand coalition in the deployment of defense measures, the optimal control problem of the grand coalition in the finite-horizon is
s.t. (1).
The following Bellman equation is satisfied.
The partial derivative with respect to r_i(t) is calculated from Equation (37) and set to 0, thus obtaining
Following the method used to solve for the feedback Nash equilibrium in the finite horizon, we obtain
When the edge node i (i ∈ N) deploys the defense measure in the grand coalition at the resource consumption of
The objective function of the optimal control problem of the intermediate coalition over a finite horizon is
s.t. (1).
The following Bellman equation must be satisfied:
In accordance with the previous solving method, the optimal solution of the intermediate coalition of edge nodes is
In the cooperative defense coalition, different edge nodes have diverse defense abilities and make various degrees of contributions to the system. Therefore, equal distribution of payoffs of the whole coalition may reduce the willingness of edge nodes to deploy defense measures by consuming their own resources. Shapley value is an effective method for allocating payoffs. It distributes cooperative payoffs fairly in accordance with the contributions of each participant [19]. In the present study, the Shapley value can be considered the average marginal contribution ratio of each edge node in every possible resource cooperative coalition in the game. Thus, the Shapley value is unique and is frequently applied as the distribution mechanism of a cooperative differential game.
In accordance with the method for calculating the characteristic functional value in Reference [20], the characteristic functional value in the cooperative game can be gained.
Therefore, the expression of Shapley value in the cooperation game is
The Shapley value allocates total payoffs in accordance with the marginal contributions of edge nodes. Furthermore, the Shapley value reflects fairness, and all edge nodes obtain high payoffs through cooperation. Thus, such a payoff allocation scheme can be accepted readily by all edge nodes.
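The allocation rule described above can be sketched in a few lines for a small coalition. The function name `shapley_values` and the toy characteristic function `toy_v` are hypothetical; the characteristic function computed via the method of [20] is not reproduced here, so any callable mapping a coalition to its payoff can be passed in instead.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Shapley value of each player for characteristic function v(frozenset)."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that player p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p to coalition s.
                phi[p] += weight * (v(s | {p}) - v(s))
    return phi

def toy_v(coalition):
    """Hypothetical characteristic function: payoff grows with coalition size squared."""
    return float(len(coalition) ** 2)

if __name__ == "__main__":
    print(shapley_values(["s1", "s2", "s3"], toy_v))
```

The enumeration visits every subset of the other players, so its cost grows exponentially with the number of nodes; that is acceptable for the five-node system considered here but not for large clusters.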
In this section, we conduct simulation experiments on the proposed differential game model of defense strategy. Five edge nodes are selected as experimental nodes to form a defense system. The parameter settings of these edge nodes are displayed in Table 1. In accordance with the practical situation of defense measure deployment by edge nodes, the optimal defense strategy and payoffs of edge nodes over finite and infinite horizons are analyzed.
Simulation parameters
In this section, the optimal defense strategy of edge nodes under different situations is simulated and analyzed. First, the optimal resource consumptions of edge nodes s1, s2, s3, s4, and s5 are analyzed, with T = 20 and q = 1.5. The edge nodes deploy defense measures independently at different moments in the finite horizon under various attacking strengths. The results are illustrated in Fig. 1. The optimal defense strategies of the edge nodes vary over time. In the initial and middle stages of the game, edge nodes require many resources to realize the optimal defense effect when the influencing factor of system payoffs α_i is small. In the late stage and at the final moment of the game, the optimal resource consumption is positively related to α_i, and this positive correlation becomes more evident with time. This is consistent with the interpretation of α_i: the influencing factor of payoffs represents the degree to which external adverse factors affect an edge node. A small α_i indicates strong interference resistance, so the edge node reaches the optimal defense effect with little resource consumption.

Variation trend of the optimal resource consumption of edge nodes in the finite-horizon with time.
The optimal resource consumptions of s1, s2, s3, s4, and s5 under independent deployment of the defense measures at different moments in the infinite horizon and under various attacking strengths are depicted in Fig. 2. The optimal defense strategy of each edge node is constant and independent of time.

Variation trend of the optimal resource consumption of edge nodes in the infinite-horizon with time.
The optimal defense strategies of s1, s2, s3, s4, and s5 under different attacking strengths (χ_v) in the finite horizon and under non-cooperation are shown in Fig. 3. The results at t = 5 are analyzed. Figure 3 shows that the optimal resource consumption of edge nodes declines as attacking strength increases. The reason is that edge nodes have limited resources: if they commit more of their own resources to defense as attacking strength grows, task execution is easily affected. Thus, under strong attacks, edge nodes should commit few resources to defense measures, which protects the stability and reliability of task performance and maximizes payoffs.

Effects of attacking strength on the optimal resource consumption in the finite-horizon.
The variation of the optimal defense strategy of edge nodes with time and α_i under the same attacking strength is displayed in Fig. 4. The optimal resource consumption of edge nodes increases gradually with time but decreases as α_i increases. This result demonstrates that edge nodes must determine the appropriate resource consumption in accordance with the influence of external adverse factors when deploying defense measures.

Variation trend of the optimal resource consumption with time and the influencing factor of system payoffs.
In this section, the effects of different parameters (α_i, ɛ_i, and χ_v) on the payoffs of edge nodes under grand coalition, intermediate coalition, and non-cooperation are analyzed. Suppose five edge nodes (N = 5) are available, and the time domain is T = ∞. The influence of attacking strength on the payoffs of an edge node under grand coalition, intermediate coalition, and non-cooperation is illustrated in Fig. 5. For the same attacking strength, the payoffs of the edge node are highest under the grand coalition and lowest under non-cooperation. The results suggest that edge nodes should form a coalition with other nodes to deploy defense measures. The grand coalition reduces the optimal resource consumption and yields higher payoffs than non-cooperation. In particular, the cooperative game achieves a better balance between reward and cost.

Effects of attacking strength on payoffs of edge nodes under different coalition modes.
The variation trend of maximum payoffs with α_i and the cost factor per unit resource consumption (ɛ_i) under grand coalition is plotted in Fig. 6. Edge nodes must consider energy consumption and rewards comprehensively during the deployment of defense measures to adjust their resource consumption dynamically and achieve maximum individual payoffs.

Effects of the influencing factor of system payoffs and cost factor per unit resource consumption on payoffs of edge nodes.
A defense strategy model for edge computing based on differential game theory is proposed in this study to balance the reward and energy consumption cost of edge nodes in the deployment of defense measures. The model analyzes the optimal defense strategy and payoffs of edge nodes under grand coalition, intermediate coalition, and non-cooperation. Moreover, the payoffs of a coalition are allocated fairly and reasonably on the basis of the Shapley value. The simulation results show that edge nodes achieve the optimal outcome, the highest payoff for the least resource consumption, when a grand coalition is formed for the cooperative deployment of defense measures.
