Abstract
In this paper, the prediction of damage results for complex network is considered under grey information attack. Firstly, in order to construct more realistic networks, a new algorithm is proposed to generate 3 types of fully connected networks (normal scale-free network, scale-free network with cutoff, random network). Secondly, robustness of the 3 networks is analyzed under grey information attack. And then, a new method is proposed to predict the damage results by training the BP neural network. Thirdly, the effects of different topological parameters on the damage results are analyzed and a new method is proposed to find central nodes of the network. Finally, the damage results of a real bus network under grey information attack are predicted by the proposed method and several suggestions are given to help protect the real bus network.
Keywords
Introduction
Nowadays, we are in a world consisting of various networks, the stability and robustness of these networks are becoming paramount and interesting in science and engineering when networks are under attacked. Ref. [1] shows the huge differences in robustness between intentional attack and random attack. Ref. [2–4] resolves the size of giant component and value of critical removal fraction using generating function method based on percolation theory. Ref. [5] studies the North American power grid in view of network and analyzes its invulnerability.
In general, most attack strategies are the random attacks or intentional attacks. If the attackers know nothing about the network topology, they used to randomly attack a number of nodes or edges. If the attackers master the network topology information accurately, they can choose certain nodes or edges as their targets. While, for real networks, the information attackers knew maybe not precise or complete, based on this idea, grey information attacks were proposed. That is grey information attacks happen when attackers know something but not everything about network topology. The intentional attacks and the random attacks are two special ways for grey information attacks, so they should be mainly considered. Ref. [6, 7] study the robustness of the network with grey information attack and find that different information precision for attacking networks will lead to different damage results of the networks. Based on Ref. [6–8] proposes a new grey information attack strategy, which provides insight into the investigation of attack and defense strategies of complex networks. Although great efforts have been devoted to analyzing the network robustness under grey information attacks, there are still some open problems: How will the network change with different information precision for attacking networks? Can we predict the damage results of the network based on the experience under grey information attack?
As one of the hot fields in complex network, prediction problems attracted widespread attention recently. One of the prediction methods uses the known information to predict the unknown information. The other uses the past data to predict the future trends of the network and most of the predictions focus on network evolution [9] and link prediction [10]. Ref. [11] predicts the topology of the network by proposing an evolving network model. Ref. [12] studies a variety of techniques for link prediction in social network. Ref. [13] considers the link prediction for evolution process of different networks including brain network, protein network, ecological network and social network. In addition, Ref. [14] predicts the network security situation based on the defined parameter of the network security. However, the predictions for network robustness are less concerned. In fact, when some nodes break down, prediction provides a fast way to evaluate the final damage results of the network and can even avoid further damage by promoting assistances. Furthermore, in order to reduce the conceivable attack damages, the designers can find the key stations by prediction then make a more robust design for these key nodes. Ref. [15] makes predictions for the scale-free network under random attack, and the relative size of the giant component is predicted. However, one parameter is not enough to evaluate the network robustness, it is necessary to take more measure parameters to represent various impacts on the network.
Considering the prediction of complex network robustness, it is necessary to construct more realistic network models. In earlier researches, compared with regular networks, random networks represent the real networks better. However, recent studies find that most of these real networks are not completely random. These networks have scale-free property and high network connectivity. Ref. [16] proposes a network growing algorithm with preferential attachment to generate fully connected networks, but the degree distribution of the generated networks should always obey power law form with the exponent 3. According to the weight assigned to each node, Ref. [17] adds edges in the network to generate scale-free network with changeable exponent. However, the generated networks may not be fully connected and do not have the other non-power-law characteristics as well.
In this paper, based on grey information attack, damage result predictions for different types of network are considered. The contributions of this paper are threefold. Firstly, in order to represent the connectivity of real networks, a method to generate fully connected networks randomly is proposed. Comparing with the previous work [16, 17], the proposed method can generate networks with different degree distribution. Secondly, robustness of 3 types of network under grey information attack is analyzed and a new method is developed to predict the damage results based on the simulation results. Finally, a new method to find the central nodes in a network is provided. In addition, several suggestions are given to the city planners which will help them to protect the real bus network.
Complex network model under grey information attack
Generation of fully connected network
In order to construct more realistic networks, random network and scale-free network are considered in this paper. Most network generation algorithms can only generate scale-free network with fixed exponent or non-fully connected network [16–21]. For normal scale free network, the degree distribution is described by P (k) ∼ k−γ, here P (k) is the probability that nodes with degree k exist, γ is the exponent of the scale-free network. However, it shows that nodes with large degree in many networks are cost-prohibitive [22]. So the degree distributions of these networks are described by P (k) ∼ k−γe−(k/-κ), here κ is the cutoff. When k is greater than κ, it becomes very costly to add edges between nodes [22]. So, from Ref. [23], we randomly get two degree sequences subject to two different degree distributions respectively. For normal scale free network, the degree of node v can be calculated by
and the probability that accepts the generated k v is p = (k v / − kmin) −γ.

Flow chart to generate fully connected network.
Through configuration model (CM) [24], the method by which randomly generate the network with given degree sequences is shown in Fig. 1 and the detailed explanations for each step is as follows.
For random network, because of its uniform distribution of the degree, with the same average degree, the connectivity of the random network is worse than the scale free network. When the original network with small average degree is randomly generated through CM, random edge swap cannot make the original network fully connected. So a random network with large average degree is generated by randomly adding W edges between N nodes, where W = k · N, k is the average degree of the network. The new generated network does not have loops or multiple edges and connects well, then verify if the generated network is fully connected, if not, regenerate the network until it becomes fully connected.
For grey information attack, observed value and information precision mastered by attackers are crucial. In this paper, observed value means the degree of nodes mastered by attackers, whose accuracy depends on the information precision. In undirected connected network G = (V, E), V is the node set, E is the edge set. The real degree and the observed degree of node v are d
v
and
where δ is a uniform distribution variable in the interval [0, 1], α ∈ [0, 1] is the information precision. When α = 0,
Analysis of network robustness
For the 3 generated fully connected networks (normal scale-free network, scale-free network with cutoff and random network), nodes of the 3 networks according to the observed degree value and deduce the damage results of the network in different attack strategies (random attack, intentional attack and grey information attack). 3 parameters are selected to measure the network robustness.
(1) Relative size of the giant component S is defined as:
(2) The average path length of the giant component L is defined as
(3) Network performance parameter E is defined as
where d
ij
is the shortest distance between nodes i and j in the initial network. Furthermore, following the idea in [26], an overall parameter to measure the network robustness is proposed by combing the above three parameters linearly,
Next, several simulations are given to analyze the robustness of 3 types of networks. There are 1000 nodes in each network. Most of the real networks obey power law form with the exponent between interval [2, 3]. Therefore, the exponent of both normal scale-free network and scale-free network with cutoff is chosen as 2.5. Based on grey information, different proportions of nodes are attacked intentionally according to the decreasing order of degree and record average results of 200 simulations.
Figure 2 shows the change of robustness measurement parameters in scale-free network under grey information attack. f represents the proportion of broken nodes when the network is attacked. As shown in Fig. 2(a) and (c), nodes with larger degree tend to fail with an increasing α, and the damage results of the network become more serious. When α = 1, the nodes will be broken down sequentially according to the decreasing order of degree. In this case, when 10% of the nodes are broken down, network collapses completely. When α = 0, it means that attacks happen randomly, when f = 10%, S and L change a little, which indicates f influence the network weakly at this time. However, in Fig. 2(b), L increases at first except the case α = 1, that indicates f have greater impact on network topology than network connectivity initially. When L reaches to its peak, f have the strongest impacts on network connectivity. For α = 1, L decreases all the time, it means nodes with larger degree have a significant impact on both network connectivity and network topology.

Change of robustness measurement parameters in scale-free network under grey information attack. (a) S. (b) L. (c) E. (d) T.

Change of robustness measurement parameters in scale-free network with cutoff under grey information attack. (a) S. (b) L. (c) E. (d) T.
Figure 3 shows the change of robustness measurement parameters in scale-free network with cutoff. As shown in Fig. 3(a–d), when α = 1, the parameters are consistent with normal scale-free network. But when α ≠ 1, damage results of scale-free network with cutoff are always heavier than normal scale-free network. This is because comparing with normal scale-free network, scale-free network with cutoff has fewer nodes with large degree, which indicates for scale-free network, the more nodes with large degree there are, the more robust the network is under non-intentional attack. Furthermore, different from normal scale-free network, in scale-free network with cutoff, when α > 0.6, the change of parameters under grey information attack is similar to intentional attack. This is because when the maximum degree of network is small, by Equation (3), bigger α causes smaller random distribution range of
As shown in Fig. 4(a–d), for each α, robustness measurement parameters change a little. Particularly in Fig. 4(b), when α = 1, L does not decline directly, but rises at first and declines after reaching its peak. It indicates that, in random network, the degree distribution of nodes is more uniform, and there are not any nodes with particular large degree which can influence network robustness remarkably.

Change of robustness measure parameters in random network under grey information attack. (a) S. (b) L. (c) E. (d) T.
Under grey information attack, there is significant difference between scale-free network and random network. For scale-free network, if there are more nodes with large degree, the damage results will be lightened. For random network, there is not any node with large degree.
Neural networks can approach any non-linear function by training. So in this paper, we will use BP neural network to predict the damage results. In input layer, 25 parameters which can reflect the number and topology of broken nodes are considered as inputs. They are showed in Table 1. 3 parameters which can reflect damage results of the network are used as the outputs. They are S, L and E. The number of middle layer neurons is determined by the following formula.
The data of simulation results in Subsection 3.1 is used to train the neural network. The nonlinear relationship of each network can be gotten by training the corresponding network. The 25 input parameters are normalized with the following formula.
Weight contribution rate of topological parameters of broken nodes in different network (%)
where y
i
and
For the prediction of normal scale-free network, the minimum, median and mode value of clustering coefficient in each set of data are always the same, so these 3 input parameters in the prediction model are not considered. As shown in Fig. 5, by BP neural network prediction, the real damage results and predicted damage results under grey information attack with different information precision α and the proportion of broken nodes f are compared, the left figures are based on real values of S, L and E, and the right figures are based on predicted values of S, L and E. These figures demonstrate small differences between real values and predicted values. And by Equations (10 and 11), the average prediction accuracy of S, L, E is 94.18%, 85.54%, and 96.41% respectively. As shown in Fig. 5(d), the average prediction accuracy of T is 94.18%. So the predictions are accurate.

Change of network damage results between real and predicted value in normal scale-free network under grey information attack. (a) S. (b) L. (c) E. (d) T.
For the scale-free network with cutoff, the minimum, median and mode value of clustering coefficient, and maximum value of coreness in each set of data are always the same, so these 4 input parameters are not considered in the prediction model. As shown in Fig. 6(a–c), by BP neural network prediction, the damage results under grey information attack with different information precision α and the proportion of broken nodes f are compared. The average prediction accuracy of S, L, E and T is 90.20%, 84.42%, 91.02% , 91.24% respectively, the predictions are also accurate.

Change of network damage results between real and predicted value in scale-free network with cutoff under grey information attack. (a) S. (b) L. (c) E. (d) T.
For the random network, besides the minimum and mode value of clustering coefficient, the maximum, median and mode value of coreness are always the same, so these 5 input parameters are not considered. As shown in Fig. 7(a–c), by BP neural network prediction, the damage results under grey information attack with different information precision α and the proportion of broken nodes f are compared. The average prediction accuracy of S, L, E and T is 96.64%, 88.91%, 98.04% and 97.45% respectively. It is the most accurate prediction among the 3 networks.

Change of network damage results between real and predicted value in random network under grey information attack. (a) S. (b) L. (c) E. (d) T.
By training the neural networks, we can predict the damage results for different networks. So, we can assess damage results of the network caused by any attacks in advance. Next, we will find the main parameters influencing the predicted results in order to estimate the central nodes which have considerable impact on network robustness.
In this paper, the weight contribution rate is introduced to analyze the various influence factors which include the number parameter and different topological parameters of broken nodes. The weight contribution rate is defined as the ratio of various input parameters to the output parameters in BP neural network, which can be calculated as follows
Under grey information attack, Table 1 shows the ratio of 25 factors of broken nodes to the 3 robustness measurement parameters.
As shown in Table 1, for different networks, S, L, E can be influenced differently by various factors of broken nodes. Next, we will consider the effect of various factors on the whole robustness measurement parameters T. The effect on T can be calculated by (7). For normal scale-free network, the top five factors which have the greatest impact on the network robustness are average degree, betweenness variance, degree variance, median betweenness and average betweenness. While in scale-free network with cutoff, the top five factors that influence robustness the most are average degree, betweenness variance, average betweenness, degree variance and the number of broken nodes. But in random network, they are the number of broken nodes, average degree, average betweenness, degree variance and average coreness.
For normal scale-free network, the number of broken nodes would not be the main factor affects the network robustness, there are some central nodes, which are determined by degree and betweenness of nodes and have great influence on the robustness. But for scale-free network with cutoff, the number of broken nodes is the main factor affecting the robustness of the network, the impact of central nodes is less. For random network, the most important factor affecting the network robustness is the number of broken nodes.
In this section, we take Shenyang (one of the biggest cities in China) bus network (data from http://shenyang.8684.cn/) as an example. Firstly, a bus network model is constructed, the bus stops are chosen as nodes and an edge exists between two nodes if they are consecutive stops on any bus line [27]. Then, the robustness of this real network under grey information attack is analyzed, which is shown in Fig. 8.
Comparing Fig. 8 with Fig. 3, under grey information attack, there are many similarities for S, L, E and T between Shenyang bus network and scale-free network with cutoff. The damage results of the two networks are the same when α is small. This is because the degree distribution of Shenyang bus network are the same as scale-free network with cutoff, which means that Shenyang bus network does not have the nodes with large degree [28].

Change of robustness measure parameters in Shenyang bus network under grey information attack. (a) S, (b) L, (c) E, (d) T.
While when α > 0.6, the damage results of Shenyang bus network are less serious than scale-free network with cutoff. Because the degree of Shenyang bus network is smaller than the scale-free network with cutoff, the impact is lighter when the nodes with large degree are removed. In a word, it can be inferred that Shenyang bus network has many similarities with scale-free network with cutoff, but they are not the same.
Next, the damage results by training BP neural network are predicted. Under grey information attack, the comparisons of real value with predicted value are shown in Fig. 9. The mode of clustering coefficient in each set of data is always the same, so the mode of clustering coefficient is not considered in the prediction. The average prediction accuracy rate of S, L, E are 88.98%, 80.89% and 91.68% respectively. And its average prediction accuracy rate is 90.54%.

Change of network damage results between real and predicted value in Shenyang bus network under grey information attack. (a) S. (b) L. (c) E. (d) T.
Then different contribution rate of broken nodes on 3 measure parameters and T for Shenyang bus network are shown in Table 2.
Different topological parameters of broken nodes to contribution rate of Shenyang bus network robustness (%)
As shown in Table 2, in Shenyang bus network, the top five factors of broken nodes that sharply influence the network robustness are the number of broken nodes, average degree, degree variance, average coreness and average betweenness. The number of broken nodes is the primary parameter in Shenyang bus network, and the central nodes are determined by degree, coreness and betweenness of nodes. Compared with Table 1, it is easy to find that the top five factors of Shenyang bus network are close to random network. The most influential factor affecting the network robustness is the number of broken nodes. Meanwhile, other 4 influential topological parameters are the same as random network. So, it is inferred that Shenyang bus network is not only similar to scale-free network with cutoff, but also has several similarities with random network.
By analyzing the main factors on T in Shenyang bus network, some suggestions are proposed to ensure the bus network work well. For one thing, a large number of non-central nodes should be avoid breaking down. This is because the number of broken nodes is the most important factor that affects network robustness. For another thing, the stops passed by more bus lines should be protected because degree is the main parameter for central nodes.
In this paper, firstly, a new way is proposed to randomly generate fully connected networks and analyzed the robustness of 3 networks under grey information attack. Secondly, based on different information precision, the network damage results of the 3 kinds of complex network are simulated. Thirdly, by training the BP neural network, the damage results can be predicted for other information precision. The topological parameters of broken nodes that influence the robustness of different network are also analyzed. This paper found that the central nodes can be determined by the degree and betweenness of nodes in normal scale-free network and scale-free network with cutoff. But for random network, the central nodes are determined by the nodes coreness. Furthermore, various types of networks are affected differently by central nodes. Finally, the damage results of Shenyang bus network under grey information attack were predicted. And it is found that the central nodes are determined by the degree, betweeness and coreness. Also, several suggestions are given to help protect the real bus network.
Footnotes
Acknowledgments
This work is partially supported by the National Natural Science Foundation of China (61473073, 61104074), Fundamental Research Funds for the Central Universities (N170417006) and Program for Liaoning Excellent Talents in University (LJQ2014028).
