Network security situation prediction in the cloud environment based on grey neural network

Abstract

Existing network security prediction methods for the cloud environment are limited in terms of both accuracy and real-time performance. In this paper, we address these issues with a proposal for a method based on grey neural network to predict network security situations in cloud environments. First, we explore security factors for network security situation awareness based on classification and fusion techniques in order to generate awareness indexes. Through this, we establish a hierarchical index system for network security situation. Then, a method is elaborated that combines grey theory and neural networks to predict network security situations by analyzing the features of grey and neural networks that combine high accuracy and real-time performance. Finally, through experiments with simulated data, a network prediction algorithm for security situations is verified. Results of experiments show that the method is both correct and feasible.

Keywords

Network security situation assessment, network security situation prediction, grey theory, grey neural network model

1. Introduction

With the development of communication and cloud computing technology, computer networks are used ever more widely in every area of daily life. At the same time, network security incidents caused by malicious attack or sabotage are becoming more common, and security issues are increasingly prominent. In particular, the number of security vulnerabilities and security incidents is expected to grow considerably, so networks face a serious security problem. Traditional protection equipment with limited functions, such as intrusion detection systems (IDS), firewalls and security scanners, generally works in an independent or semi-independent state. Because these devices exchange little information with one another, they cannot meet the needs of network security. How to grasp the security state of the whole network comprehensively and promptly is therefore an important problem. Without a common cognition of the network (a shared understanding of its overall state) for protecting network resources, administrators are greatly delayed in making judgments and in dealing with threats at the best time. Such a common understanding helps managers gain a clear picture of the network situation as early as possible, which in turn is expected to help solve network security problems.

Situation Awareness (SA) originated in the study of space flight. It has since been widely used in fields such as the military battlefield, nuclear reaction control systems, air traffic control and medical emergency dispatch. The concept of Cyberspace Situation Awareness (CSA) was first proposed by Tim Bass in 1999 [1], who pointed out that network situation awareness based on fusion would become a development direction of network management. A situation emphasizes the connection between environment and entity; it is a dynamic state, a trend, and a whole, global concept. No single case or state can be called a situation. Situation awareness is currently a hot research topic because, in a dynamic and complex network environment, decision makers can use situation awareness tools to follow changes in the global state and so make the right decisions.

At present, network security situation awareness has no unified, comprehensive definition. By extracting the relevant network and security factors, it analyzes and understands the security state of the whole network and predicts the future development of the network security situation. Research focuses on the efficient organization and overall awareness of complex information, with the aim of shortening the time from obtaining information to making decisions. It mainly applies information fusion technology to multi-source heterogeneous observational data, such as intrusion detection logs, firewall logs, virus logs, network scanning data, illegal external link data, the running state of equipment and real-time alarm information.

Network security situation awareness needs an accurate strategy for assessment and prediction. It helps keep the network running healthily, which has become a very important factor in social and economic development.

Traditional network security situation awareness offers security administrators considerable convenience in analysis, but most approaches are based on the analysis of system logs. They suffer from several problems: the data source is single, real-time performance is poor, and the awareness results depend too heavily on the experience of network management personnel. The research direction of network security has changed from passive security system construction to intrusion detection and attack defense, and situation awareness is a form of active security system construction. Accordingly, research has shifted from individual security problems to the security situation of the network as a whole. However, security situation awareness is still at an early stage both at home and abroad, and the relevant theory and technology are not yet mature, so an effective and accurate awareness method is urgently needed.

Network security situation prediction is a general prediction of the future development trend of network security and a useful supplement to network security situation awareness. Traditional prediction methods mainly lack real-time performance. Existing methods cannot accurately reflect future changes in the security situation [2], and they do not handle well the relationship between security elements and the future network security situation. In this paper, a prediction method of network security situation based on temporal and spatial analysis is proposed.

To address the low accuracy and poor real-time performance of existing network security situation prediction in the cloud environment, a method based on a grey neural network is proposed. The method is fast and efficient, and it can fuse multiple levels of multi-source, heterogeneous information. It combines grey theory with a neural network, drawing on the features of both to achieve high accuracy and real-time performance, and it gives a macro view that strengthens understanding of how the network develops. It also provides a reliable reference for understanding future network security and reliable decision support for administrators.

2. Related work

Network security situation awareness represents the current running state of a whole network from a macro, overall perspective, and it can also predict the network security state in the next phase. Research on it mainly covers two aspects: situation awareness and situation prediction. Before the network security situation can be predicted, a large number of network security elements must be collected over some period of time or region of space. Generally, the data for the relevant security situation factors are obtained first and then classified and integrated to give an overall network security state. The future development trend can then be analyzed and predicted in numerical and graphical form. Network security situation awareness and prediction provide a reliable basis for network administrators to make security policy. For network security situation prediction, researchers have put forward a large number of methods and models from multiple angles and dimensions.

In many countries, research on network security situation awareness mainly acquires information distributed across various types of security equipment in heterogeneous networks and applies different data fusion techniques to it. Awareness of the network security situation is achieved by establishing multiple awareness or prediction models. Representatively, Bass established a network model for intrusion detection systems with heterogeneous information to assess the current network security situation. Cristina Abad [3] uses Unclog+ to design a system for network security situation awareness. By collecting information on network security events from security sensors and performing correlation analysis, the system fuses this information to obtain the network security situation.

Much valuable research has been carried out on network security situation awareness and prediction methods [4, 5], and its achievements will guide future work. In [6], a new method of network security situation awareness based on a neural network is proposed to address the time and space complexity of network security situation awareness. In [7], a real-time HMM-NSSP model based on the hidden Markov model (HMM) is proposed from a combination of theory and practice. The HMM-NSSP model is used to establish an HMM prediction model, and the parameters of network nodes are modified dynamically in order to monitor and predict the security situation of the whole network in time. For network security situation awareness based on the hidden Markov model, [8] improves how the observation sequence is acquired and how the transfer matrix is established, and the risk value given by the improved algorithm reflects the network security situation more reasonably.

Pu [9] uses the grey theory model GM (1,1) to predict the network security situation by mining the relevant information from the randomly varying time series of the network security situation. Wei et al. [10] apply time series analysis and prediction to the network security situation, the principle being to predict future development from the trend of the historical data series itself. Hu et al. [11] use a Bayesian network as a graph model of network security events that describes the probabilistic connections between them, and derive from this model the probability that a security event occurs at the next moment. Wang [12] predicts the network security situation with an SVM model, which has the advantage of fast convergence.

The prediction methods above share a common shortcoming: they consider only the network security situation itself, or the variation of the security situation sequence, without taking other security factors into account. This lack of information leads to large differences between predicted and actual results, so the prediction precision is not high. To address these problems, this paper proposes a method based on a grey neural network to predict the network security situation. The method makes full use of every security factor of the network security situation by combining the grey model GM (1,1) with multiple factors.

3. Network security situation awareness

3.1 Gaining situation factors

It is very important to construct a reasonable index system of the security situation before network security situation awareness or prediction. Different security factors, quantification methods, network weighting methods, fusion methods and awareness models lead to different network security situation values, which may influence the accuracy of awareness or prediction.

The network security situation is determined by a variety of situation factors, such as hacker attacks, viruses, Trojans and malicious code. Because of the complexity of the network and the diversity of devices and security incidents, network security elements must be considered from multiple angles [13]. Note that the situation factors are a subset of the security factors.

The fused network security information, which takes all aspects of the network into account, comes mainly from key network nodes, including firewalls, intrusion detection systems and various scanning systems. The network data flow information used for fusion includes packet size and distribution, packet loss rate, data flow and flow change ratio, protocol distribution, data flows, outflow growth rate, the IP addresses of data sources and their distribution, and the subnet bandwidth usage ratio. Alarm information from security equipment includes the frequency of security incidents, Trojans, viruses and other malicious attacks. Network vulnerability information includes the number of network vulnerabilities and the number of vulnerabilities in key equipment. Security device configuration information includes the number of open ports, the number of key devices, the number of security devices in the subnetwork and the number of anti-virus software installations.

Generally, the original heterogeneous data collected from the network are complex and diverse, and they contain irrelevant, false and redundant items. Mathematical methods are therefore applied, according to certain rules, to extract network security information and obtain the main indexes of the network security situation. The extraction in this paper relies mainly on correlation analysis and rule matching to eliminate redundant, false and irrelevant security factors.

A network contains many devices, and the relationships among them are complex, so analyzing the raw data directly is not desirable. The collected data must first be converted into situation awareness indexes before situation awareness or prediction. The first step of network security situation awareness is to obtain the original security factors. Because the amount of information is large, the factors undergo comprehensive quantification and classification so that each can be selected as a situation awareness index. The security factors that are quantified mainly include the change rate of network traffic, the packet loss rate of data streams, vulnerabilities, security equipment quantity statistics, malicious code, and security events such as Trojans and viruses. After quantification, all factors are classified into the various indexes for situation awareness. The integration model for network security factors is shown in Fig. 1.

Figure 1. Integrated model of network security situation factors.

3.2 Awareness index architecture

After redundancy elimination, quantification, fusion and classification, the awareness indexes can be used to assess the network security situation. Following the hierarchical structure of the network system, this paper selects indexes at different levels (nodes, hosts and the whole network), from different information sources (host running state, flow, alarm, log and asset allocation information) and for different services. Through these awareness indexes, the network security situation and the regularities of its change are described quantitatively. The hierarchical architecture of the awareness indexes is shown in Fig. 2.

Figure 2. Hierarchical architecture of network security situation awareness index.

As shown in Fig. 2, the architecture is divided into three levels from bottom to top. Network security situation prediction is in fact a process of data fusion: from part to whole, the data gradually converge into the overall network security situation index. There are three main macro indexes: the network basic running index, the network vulnerability index and the network threat index. Each second-level index is a weighted combination of third-level indexes. The basic running index reflects the internal running state of the network, the vulnerability index reflects the potential vulnerability of the network, and the threat index reflects the attacks on the network. In Fig. 2, the network security situation is at the first level and rests on the basic running index, the vulnerability index and the threat index, which form the three dimensions of the index system. Each third-level index is weighted from the relevant bottom-level data. The bottom layer of data sources consists of the many security factors that strongly affect the network security situation, mainly including traffic, service state, resource consumption, vulnerability state, protection software, Trojans and other data.

Calculating the index weights is an important step, and this paper uses the entropy weight method. The entropy weight method determines the weight of each index according to the amount of information that the index conveys to the decision maker at each level. The information it requires comes from all kinds of network security equipment and security event records, so the awareness results are highly objective. We use this method to weight the indexes at all levels. The smaller the entropy of an awareness index, the more information the index carries and the larger its weight. The lower-level indexes of the network threat index are weighted by the entropy method in the following steps.

In the database, the characteristic indexes are gathered and arranged in a fixed order, so the $n$ awareness indexes sampled at the $m$ times $t,t+1,\ldots,t+m-1$ form an $m\times n$ matrix:

$\displaystyle\left[\begin{array}{cccc}a_{11}&a_{12}&\ldots&a_{1n}\\ a_{21}&a_{22}&\ldots&a_{2n}\\ \ldots&\ldots&\ldots&\ldots\\ a_{m1}&a_{m2}&\ldots&a_{mn}\end{array}\right]$

In this matrix, $a_{i1},a_{i2},\ldots,a_{in}$ are the third-level awareness indexes at the $i$-th sampling time. Each awareness index is normalized as follows, and its entropy value is then calculated:

$\displaystyle s_{ij}=a_{ij}\Big/\sum\limits_{i=1}^{m}a_{ij},\quad i=1,2,\ldots,m;\ j=1,2,\ldots,n.$ (1)

$\displaystyle h_{j}=-\frac{1}{\ln m}\sum\limits_{i=1}^{m}s_{ij}\ln s_{ij},\quad j=1,2,\ldots,n.$ (2)

The entropy value is then converted into a weight that reflects the differences between indexes. The denominator sums over all $n$ indexes, so the weights add up to 1:

$\displaystyle w_{j}=\frac{1-h_{j}}{n-\sum\limits_{l=1}^{n}h_{l}},\quad j=1,2,\ldots,n.$ (3)

The network threat index is determined by all of its lower-level indexes; at time $t$ it is:

$\displaystyle e(t)=\sum\limits_{i=1}^{n}g(a_{i}(t))\times w_{i}$ (4)

Here $g(a_{i}(t))$ is the normalized value of the $i$-th lower-level index at time $t$. The network basic running index and the network vulnerability index are determined in the same way. The entropy method can likewise determine the weights of the higher-level indexes, which fuse all the information into the security situation of the whole network.
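The entropy weighting steps can be illustrated with a short sketch. It assumes NumPy, non-negative index samples with non-zero column sums, and weights normalized so that they sum to one (the usual entropy-weight convention, assumed here); the function names `entropy_weights` and `threat_index` are hypothetical and not part of the paper.

```python
import numpy as np

def entropy_weights(a):
    """Entropy weights for an m x n matrix of non-negative index samples."""
    a = np.asarray(a, dtype=float)
    m, n = a.shape
    s = a / a.sum(axis=0)                      # Eq. (1): column-wise proportions
    # Treat 0 * ln(0) as 0 so that zero entries do not produce NaN.
    terms = np.where(s > 0, s * np.log(np.where(s > 0, s, 1.0)), 0.0)
    h = -terms.sum(axis=0) / np.log(m)         # Eq. (2): entropy of each index
    d = 1.0 - h                                # divergence; larger means more informative
    return d / d.sum()                         # assumed normalization: weights sum to 1

def threat_index(g_t, w):
    """Eq. (4): weighted sum of the normalized index values at one time step."""
    return float(np.dot(g_t, w))
```

An index whose samples never change has entropy 1 and receives a weight close to zero, which matches the statement that lower entropy means a larger weight. If every index is constant, all divergences are zero and the normalization is undefined, so a real implementation would need to handle that case.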

4. Grey model

Grey theory was proposed by Chinese scholars Yang et al. [14]. Related research has developed quickly, and grey theory is now a complete theoretical system. Its research object is the uncertain system with “small samples” or “poor information”, in which part of the information is known and part is unknown. Mainly by generating, developing and extracting the valuable information that is available, grey theory arrives at a correct understanding and description of the system so that scientific prediction can be carried out. The grey prediction model is built on the known and unknown network information of the past and present. It establishes a GM model extending from the past to the future, so as to determine the future development of the system and provide a basis for decisions.

4.1 Grey model GM (1,1)

The GM (1,1) model is the foundation of grey prediction, and its prediction process is as follows. Let the $i$-th sequence over time be $X_{i}^{(0)}=(x_{i}^{(0)}(1),x_{i}^{(0)}(2),\ldots,x_{i}^{(0)}(k))$, where $i=2,\ldots,N$ and $k=1,2,\ldots,n$. Accumulating this series gives $X_{i}^{(1)}=(x_{i}^{(1)}(1),x_{i}^{(1)}(2),\ldots,x_{i}^{(1)}(k))$ with $x_{i}^{(1)}(k)=\sum\limits_{j=1}^{k}x_{i}^{(0)}(j)$. From the $i$-th accumulated sequence $X_{i}^{(1)}$, a first-order whitening differential equation is constructed:

$\displaystyle dx_{i}^{(1)}(t)/dt+ax_{i}^{(1)}(t)=b$ (5)

Its solution is:

$\displaystyle x_{i}^{(1)}(t)=(x_{i}^{(1)}(1)-b/a)e^{-at}+b/a$ (6)

The corresponding grey differential equation is:

$\displaystyle x_{i}^{(0)}(k)+az_{i}^{(1)}(k)=b\text{ and }z_{i}^{(1)}(k)=(x_{i}^{(1)}(k-1)+x_{i}^{(1)}(k))/2$ (7)

Here $a$ is the development coefficient and $b$ is the grey action quantity. $z_{i}^{(1)}(k)$ is the background value of $x_{i}^{(1)}$ on the interval $[k-1,k]$. Let $\alpha=[a,b]^{T}$; by the least squares method, $\alpha=(B^{T}B)^{-1}B^{T}Y$, where the first column of $B$ carries a minus sign because Eq. (7) gives $x_{i}^{(0)}(k)=-az_{i}^{(1)}(k)+b$:

$\displaystyle B=\left[\begin{array}{cc}-z_{i}^{(1)}(2)&1\\ -z_{i}^{(1)}(3)&1\\ \ldots&\ldots\\ -z_{i}^{(1)}(k)&1\end{array}\right],\quad Y=\left[\begin{array}{c}x_{i}^{(0)}(2)\\ x_{i}^{(0)}(3)\\ \ldots\\ x_{i}^{(0)}(k)\end{array}\right]$

The predicted value of the accumulated sequence is then obtained from:

$\displaystyle\hat{x}_{i}^{(1)}(k+1)=(x_{i}^{(1)}(1)-b/a)e^{-ak}+b/a$ (8)

Since $x_{i}^{(1)}(1)=x_{i}^{(0)}(1)$, subtracting consecutive values of Eq. (8) gives the predicted value of the original sequence $X_{i}^{(0)}$ at time $k+1$:

$\displaystyle\hat{x}_{i}^{(0)}(k+1)=\hat{x}_{i}^{(1)}(k+1)-\hat{x}_{i}^{(1)}(k)=(1-e^{a})(x_{i}^{(0)}(1)-b/a)e^{-ak}$ (9)

For network security situation prediction, this paper starts with the third-level awareness indexes. The GM (1,1) model is applied to the historical time series of each third-level index to obtain its predicted value at time $k+1$, which serves as input data for the grey neural network described next.
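The GM (1,1) procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes NumPy, a positive input sequence, and a design matrix with rows $[-z(j), 1]$, which is the sign convention implied by Eq. (7). The names `gm11_fit` and `gm11_predict` are hypothetical.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate (a, b) of GM(1,1) from a positive sequence x0 by least squares."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                          # accumulated sequence
    z = 0.5 * (x1[:-1] + x1[1:])                # background values z(2), ..., z(k)
    B = np.column_stack([-z, np.ones_like(z)])  # rows [-z(j), 1]
    Y = x0[1:]
    (a, b), *_ = np.linalg.lstsq(B, Y, rcond=None)
    return a, b

def gm11_predict(x0, steps=1):
    """Forecast the next `steps` values of x0 with a fitted GM(1,1) model."""
    x0 = np.asarray(x0, dtype=float)
    a, b = gm11_fit(x0)
    k = len(x0)
    # Exponents k-1, ..., k+steps-1 give x1_hat(k), ..., x1_hat(k+steps) as in Eq. (8).
    idx = np.arange(k - 1, k + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * idx) + b / a
    return np.diff(x1_hat)                      # inverse accumulation back to x0
```

For example, `gm11_predict(history, steps=1)` would return the one-step-ahead forecast of a single index series. The model divides by $a$, so a sequence that is exactly constant (for which $a=0$) needs separate handling.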

4.2 Grey neural network model

The grey neural network is established by combining a BP neural network with the solution of the differential equation of the grey model. Consider the sequences $X_{i}^{(0)}=(x_{i}^{(0)}(1),x_{i}^{(0)}(2),\ldots,x_{i}^{(0)}(k))$, $i=1,2,3,\ldots,N$. Each sequence is accumulated in the same way as before, which gives a whitening differential equation in $N$ variables:

$\displaystyle dx_{1}^{(1)}/dt+ax_{1}^{(1)}=b_{1}x_{2}^{(1)}+b_{2}x_{3}^{(1)}+\cdots+b_{N-1}x_{N}^{(1)}$ (10)

The corresponding grey differential equation is:

$\displaystyle x_{1}^{(0)}(k)+az_{1}^{(1)}(k)=\sum\limits_{i=2}^{N}b_{i-1}x_{i}^{(1)}(k)$ (11)

This is a grey differential equation model with $N$ variables. The initial parameter matrix is:

$\displaystyle B=\left[\begin{array}{cccc}-z_{1}^{(1)}(2)&x_{2}^{(1)}(2)&\ldots&x_{N}^{(1)}(2)\\ -z_{1}^{(1)}(3)&x_{2}^{(1)}(3)&\ldots&x_{N}^{(1)}(3)\\ \ldots&\ldots&\ldots&\ldots\\ -z_{1}^{(1)}(k)&x_{2}^{(1)}(k)&\ldots&x_{N}^{(1)}(k)\end{array}\right],\quad Y=\left[\begin{array}{c}x_{1}^{(0)}(2)\\ x_{1}^{(0)}(3)\\ \ldots\\ x_{1}^{(0)}(k)\end{array}\right]$

Here, $z_{1}^{(1)}(k)=(x_{1}^{(1)}(k-1)+x_{1}^{(1)}(k))/2$. The parameter vector $\beta=[a,b_{1},b_{2},\ldots,b_{N-1}]^{T}$ is estimated by the least squares method:

$\displaystyle\beta=(B^{T}B)^{-1}B^{T}Y$ (12)
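The least squares estimate of Eq. (12) can be sketched as follows. This assumes NumPy and an input array whose first row is the series to be predicted and whose remaining rows are the related factor series; the function name `gm1n_fit` is hypothetical. `numpy.linalg.lstsq` is used in place of the explicit inverse $(B^{T}B)^{-1}$ because it yields the same solution with better numerical behaviour.

```python
import numpy as np

def gm1n_fit(x):
    """Estimate beta = [a, b_1, ..., b_{N-1}] from an N x k array of sequences.

    Row 0 is the series to be predicted (x_1); rows 1.. are the related factors.
    """
    x = np.asarray(x, dtype=float)
    x1 = np.cumsum(x, axis=1)                   # accumulate every series
    z = 0.5 * (x1[0, :-1] + x1[0, 1:])          # background values of x_1
    B = np.column_stack([-z, x1[1:, 1:].T])     # rows [-z(j), x_2(j), ..., x_N(j)]
    Y = x[0, 1:]
    beta, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return beta
```

The estimate is only well determined when there are at least $N$ usable time steps (so that $B$ has at least as many rows as columns) and the factor series are not linearly dependent.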

The discrete solution of the grey differential equation at time $k+1$ gives the time response used for prediction:

$\displaystyle\hat{x}_{1}^{(1)}(k+1)=\left(x_{1}^{(0)}(1)-\frac{b_{1}}{a}\hat{x}_{2}^{(1)}(k+1)-\frac{b_{2}}{a}\hat{x}_{3}^{(1)}(k+1)-\cdots-\frac{b_{N-1}}{a}\hat{x}_{N}^{(1)}(k+1)\right)e^{-a(k+1)}+\frac{b_{1}}{a}\hat{x}_{2}^{(1)}(k+1)+\frac{b_{2}}{a}\hat{x}_{3}^{(1)}(k+1)+\cdots+\frac{b_{N-1}}{a}\hat{x}_{N}^{(1)}(k+1)$ (13)

To map Eq. (13) onto a BP neural network, its right-hand side is multiplied and divided by $1+e^{-a(k+1)}$, with $d$ defined in Eq. (15):

$\displaystyle\hat{x}_{1}^{(1)}(k+1)=\left(\frac{(x_{1}^{(0)}(1)-d)\,e^{-a(k+1)}}{1+e^{-a(k+1)}}+\frac{d}{1+e^{-a(k+1)}}\right)(1+e^{-a(k+1)})$

$\displaystyle\quad=\left((x_{1}^{(0)}(1)-d)-\frac{x_{1}^{(0)}(1)}{1+e^{-a(k+1)}}+\frac{2d}{1+e^{-a(k+1)}}\right)(1+e^{-a(k+1)})$ (14)

$\displaystyle d=\frac{b_{1}}{a}\hat{x}_{2}^{(1)}(k+1)+\frac{b_{2}}{a}\hat{x}_{3}^{(1)}(k+1)+\cdots+\frac{b_{N-1}}{a}\hat{x}_{N}^{(1)}(k+1)$ (15)
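The second equality in Eq. (14) rests on one identity that is easy to miss. As a sketch, write $E=e^{-a(k+1)}$ and $c=x_{1}^{(0)}(1)$, which equals $x_{1}^{(1)}(1)$ because the accumulated series starts at the same value (both symbols are shorthand introduced here only for brevity). Then:

$\displaystyle\frac{E}{1+E}=1-\frac{1}{1+E}$

$\displaystyle(c-d)\frac{E}{1+E}+\frac{d}{1+E}=(c-d)-\frac{c-d}{1+E}+\frac{d}{1+E}=(c-d)-\frac{c}{1+E}+\frac{2d}{1+E}$

Multiplying by $(1+E)$ recovers $(c-d)E+d$, the right-hand side of Eq. (13). The term $1/(1+E)$ is the sigmoid function evaluated at $a(k+1)$, which is what allows the expression to be read as a network with a sigmoid hidden unit.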

Equation (14) can be mapped onto an extended BP neural network, which gives the grey neural network shown in Fig. 3.

Figure 3. The structure of grey neural network.

Here, $k+1$ is the input sequence number; $\hat{x}_{2}^{(1)}(k+1),\hat{x}_{3}^{(1)}(k+1),\ldots,\hat{x}_{N}^{(1)}(k+1)$ are the inputs of the neural network; $w_{11},w_{21},w_{22},\ldots,w_{2N},w_{31},w_{32},\ldots,w_{3N}$ are the network weights; and $\hat{x}_{1}^{(1)}(k+1)$ is the output. LA, LB, LC and LD are the four layers of the grey neural network. From the above, the network weights are $w_{11}=a$, $w_{21}=-x_{1}^{(0)}(1)$, $w_{22}=2b_{1}/a,\ldots,w_{2N}=2b_{N-1}/a$, and $w_{31}=w_{32}=\ldots=w_{3N}=1+e^{-a(k+1)}$. The threshold of the LD output node is $Q=(1+e^{-a(k+1)})(d-x_{1}^{(0)}(1))$. Finally, the predicted value of the original sequence is recovered from $\hat{x}_{1}^{(1)}(k+1)$ by inverse accumulation:

$\displaystyle\hat{x}_{1}^{(0)}(k+1)=\hat{x}_{1}^{(1)}(k+1)-\hat{x}_{1}^{(1)}(k)$ (16)
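As a quick numerical check of the mapping, the sketch below evaluates the time response of Eq. (13) directly and again through the layered weights and threshold listed above. It assumes NumPy and that the stated pattern for the second-layer weights extends to every factor input ($w_{2i}=2b_{i-1}/a$ for $i=2,\ldots,N$); the function name `gnn_output` and its argument names are hypothetical.

```python
import numpy as np

def gnn_output(k_plus_1, x1_first, factors_hat, a, b):
    """Evaluate Eq. (13) and the layered form of Eq. (14) for one time index."""
    b = np.asarray(b, dtype=float)
    factors_hat = np.asarray(factors_hat, dtype=float)
    d = np.dot(b, factors_hat) / a                  # Eq. (15)
    e = np.exp(-a * k_plus_1)
    direct = (x1_first - d) * e + d                 # Eq. (13)

    sig = 1.0 / (1.0 + e)                           # sigmoid of w11 * (k+1), with w11 = a
    w21 = -x1_first
    w2 = 2.0 * b / a                                # assumed: w22, ..., w2N
    w3 = 1.0 + e                                    # w31 = ... = w3N
    theta = w3 * (d - x1_first)                     # threshold Q of the output node
    layered = w3 * (w21 * sig + np.dot(w2, factors_hat) * sig) - theta
    return direct, layered
```

The two returned values agree up to floating-point error, which is why the least squares estimates of $a$ and $b_{i}$ can serve as initial weights before BP training adjusts them.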

5. Prediction algorithm of grey neural network

Grey theory is a branch of applied mathematics that studies phenomena for which the information is partly clear and partly unclear or uncertain, which makes it well suited to network security situation prediction. The grey neural network combines the advantages of the grey model with those of the neural network in order to predict the network security situation in the cloud environment. The grey model requires few samples and does not depend on the distribution or trend of the sample sequence. It achieves coarse prediction and is simple to model and convenient to run. The neural network offers self-learning, self-organization, adaptivity and nonlinear processing ability, which make up for the grey model's weakness in nonlinear fitting. The grey neural network model is therefore established by combining the two. The prediction idea and algorithm of this paper are as follows.

In network security situation prediction, the results are affected by many factors, and the GM (1,1) model alone generally cannot meet the requirements. To address this, the network security situation is predicted with a grey neural network that accounts for multi-factor correlation. To build the grey neural network prediction model, the GM (1,1) model first produces the predicted values of the third-level indexes. These values $\hat{x}_{i}^{(0)}(k+1)$ are the input data of the grey neural network, whose output is the predicted network security situation $\tilde{x}_{1}^{(0)}(k+1)$. With the mean square error against the actual network security situation $x_{1}^{(0)}(k+1)$ as the target, the network parameters are optimized by the BP error feedback algorithm until the weights and thresholds meet the required mean square error. The optimized grey neural network is then used to predict the network security situation.

The prediction algorithm proceeds step by step as follows:

(1) Collect the network security situation samples; the awareness indexes of the hierarchical network security situation serve as the input data.
(2) Input the historical data sequences of the third-level indexes and use the GM (1,1) model to obtain the predicted value $\hat{x}_{i}^{(0)}(k+1)$ of each index.
(3) Determine the number of samples (training samples and prediction samples). The predicted index values are the input data of the grey neural network model, and its output is the predicted network security situation. With the mean square error over all training samples as the goal, optimize the parameters with the BP error feedback algorithm until the grey neural network reaches the set mean square error.
(4) Use the optimized neural network to predict the network security situation and obtain the situation prediction value.
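The training and prediction steps (3) and (4) can be sketched with a generic back-propagation loop. This is a simplified stand-in rather than the four-layer grey neural network itself: it trains an ordinary one-hidden-layer sigmoid network by batch gradient descent and stops at a target mean square error or an iteration limit. It assumes NumPy, and that each row of `X` holds the GM (1,1) forecasts of the indexes for one sample while `y` holds the corresponding situation values. The name `train_bp`, the hidden layer size and the random initialization are assumptions of this sketch; the default step size, iteration limit and error target mirror the values given in Section 6.3.

```python
import numpy as np

def train_bp(X, y, hidden=5, lr=0.01, max_iter=1000, mse_target=1e-3, seed=0):
    """Train a one-hidden-layer sigmoid network by batch gradient descent."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    mse = np.inf
    for _ in range(max_iter):
        H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))    # hidden activations
        out = H @ W2 + b2                           # linear output unit
        err = out - y
        mse = float(np.mean(err ** 2))
        if mse <= mse_target:                       # stop once the target MSE is met
            break
        # Back-propagate the error (gradient of 0.5 * mean squared error).
        g_out = err / len(X)
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_H = (g_out @ W2.T) * H * (1.0 - H)
        g_W1 = X.T @ g_H
        g_b1 = g_H.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def predict(X_new):
        H_new = 1.0 / (1.0 + np.exp(-(np.asarray(X_new, dtype=float) @ W1 + b1)))
        return (H_new @ W2 + b2).ravel()

    return predict, mse
```

The function returns a `predict` callable for step (4) together with the last mean square error measured on the training samples. With a small step size and a fixed iteration limit the target error may not be reached, so the returned error should be checked before the predictions are used.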

6. Experimental analyses

6.1 Experimental deployment

To validate the feasibility and rationality of the proposed prediction method, we set up an experimental environment consisting of many hosts, routers, switches and servers. The experimental network topology is shown in Fig. 4.

Figure 4. Experimental network topology.

Users and attackers can access the hosts through the Internet. Many attack targets are located in the DMZ, including web servers, database servers, IDS servers, DNS servers, switches, network traffic monitoring servers and network intrusion detection systems.

The experimental data come mainly from the flow information monitored by the NetFlow server, the intrusion detection logs of the Snort network system, the host vulnerability scanning information from Nessus, and the firewall logs. In the experiment, various kinds of malicious traffic attack the DMZ servers, including SQL injection attacks on the database server, UDP flood attacks on the Web server and SYN flood attacks on the DNS server. These attacks produce a large number of abnormal requests to the servers within a short time, which slows server response and depletes CPU and memory resources, so that the servers cannot respond to normal requests and the entire network may be paralysed.

The grey neural network must be trained before it is used for prediction. The simulation experiment is based on the MATLAB platform and makes full use of the MATLAB neural network toolbox. In this experimental environment, the security situation can be assessed quantitatively. An ordinary user (User) and an attacker (Attacker) can access the network hosts through the Internet. The specific attack steps are as follows:

1) Ordinary users access the Web server, the main database server and the FTP file transfer server;

2) The attacker exploits an SQL injection vulnerability to attack the main database server;

3) An SQL injection attack is launched from the host (Host) against the backup database server Database2;

4) A UDP flood attack is launched against the Web server;

5) A worm attacks the FTP server;

6) Steps 2) $\sim$ 5) are used to attack the main database server, the Web server and the FTP server.

The IDS attack information, Nessus scanning information, Snort log alarm information and router NetFlow traffic information are collected and serve as the multi-source heterogeneous data sources of this simulation.

6.2 Data sampling

The attacker launches various attacks against the network environment continuously. For clarity of presentation, 300 samples are collected dynamically at 10-second intervals. Four observable indexes are sampled (CPU utilization, subnetwork bandwidth usage, average data flow of the subnetwork, and scanning alert information); the original data are shown in Fig. 5. When the host is attacked, the real-time sample data fluctuate, and the indexes are correlated with each other.

Figure 5.

Original data sampling diagram.

To prevent differences in the magnitude of the original data from adversely affecting the experiment, the awareness data for the security factors are normalized so that they fall within [0, 1] after processing.
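
The normalization step can be illustrated with a minimal sketch. The paper does not state which normalization formula is used, so linear min-max scaling is assumed here; the function name `min_max_normalize` and the sample readings are hypothetical.

```python
def min_max_normalize(values):
    """Scale a sequence of raw index readings linearly into [0, 1].

    Assumption: min-max scaling; the paper only states that the
    processed data lie in [0, 1].
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map it to 0.0 by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw CPU utilization samples (percent), not data from the paper.
cpu_utilization = [35.0, 42.0, 90.0, 61.0]
normalized = min_max_normalize(cpu_utilization)
```

Each of the four observable indexes would be scaled separately in this way, so that indexes with large raw values do not dominate the others.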

The continuous original sample data in Fig. 5 are processed as real numbers between 0 $\sim$ 1 and then mapped to five corresponding discrete values. For ease of expression, the data in Fig. 4 are all translated to the middle position of the corresponding level, as shown in Fig. 6, instead of taking the discrete value directly; otherwise the curve would become a broken line that cannot express the differences between the data. After discretization, the data fluctuate slightly above or below the corresponding discrete value. In this application, data fluctuating around level $i$ are taken as the discrete value $i$, which is convenient and easy to operate.
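
A minimal sketch of the discretization is given below. It assumes five equal-width levels over [0, 1], numbered 1 to 5; the paper does not give the level boundaries, and the names `discretize` and `level_midpoint` are hypothetical.

```python
def discretize(x, levels=5):
    """Map a normalized value in [0, 1] to a discrete level 1..levels.

    Assumption: equal-width levels; the paper does not specify boundaries.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("expected a value normalized to [0, 1]")
    # x == 1.0 is clamped into the top level.
    return min(int(x * levels), levels - 1) + 1


def level_midpoint(level, levels=5):
    """Return the middle value of a level, used as its plotting position."""
    return (level - 0.5) / levels


level = discretize(0.55)           # falls in the third of five levels
position = level_midpoint(level)   # middle of that level
```

Under this assumption, every sample that fluctuates within the band of level $i$ is assigned the same discrete value $i$.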

Figure 6.

Discrete data sampling diagram.

6.3 Experimental parameters

The experimental parameters are determined as follows. The first 7 factors, combined with the three levels of awareness indexes, are used to predict the network security situation; they determine the input ports of the grey neural network, and the number of output ports is 1. Historical awareness index values and the corresponding network security situation values are then selected to form sample pairs. The number of samples H is 115 pairs, including 93 pairs of training samples and 8 pairs of prediction samples. The learning step of the BP error feedback algorithm is set to 0.01, the maximum number of iterations is 1000, and the target mean square error is set to 0.001. In the prediction process, the grey GM (1,1) model and the BP neural network are used to predict the network security situation for samples 108 to 115.
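
For orientation, the grey component of the comparison can be sketched in its textbook form. The code below is a plain GM (1,1) fit and forecast, not the combined grey neural network of this paper; the function names `gm11_fit` and `gm11_predict` and the sample series are hypothetical.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) development coefficient a and grey input b."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 points")
    # 1-AGO: accumulate the raw series once.
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Least squares for y = -a * z + b (simple linear regression).
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(zi * zi for zi in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b


def gm11_predict(x0, steps):
    """Forecast the next `steps` values of the raw series x0."""
    a, b = gm11_fit(x0)
    if abs(a) < 1e-12:
        return [b] * steps  # degenerate case: the series is essentially flat
    n = len(x0)

    def x1_hat(k):
        # Time response of the accumulated series (k is a 0-based index).
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse accumulation restores forecasts of the raw series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical situation values, not data from the paper.
history = [0.21, 0.24, 0.26, 0.31, 0.33, 0.38, 0.41]
forecast = gm11_predict(history, steps=8)
```

Forecasting 8 steps mirrors the 8 prediction samples used in the experiment. GM (1,1) assumes a roughly exponential, non-negative series, which is one reason its error can grow when the situation value fluctuates sharply.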

6.4 Analysis of experimental results

After the samples are collected, the BP neural network is trained on the large sample set to obtain the corresponding parameters. The training process of the BP neural network is compared with that of the other algorithms in Fig. 7, which shows the training effect of each model and the corresponding change in training time.

From the analysis of the above experimental results, we can draw the following conclusions:

1) The BP neural network model based on grey theory (the improved algorithm) is easy to train with externally obtained prediction data. It converges efficiently, and its training speed is better than that of the other two algorithms, which indicates that the prediction model is feasible.
2) Given a set of temporarily monitored data, the trained BP neural network prediction model can readily predict the network security situation at a given moment, so the improved algorithm is effective.

When a set of run-time data is collected dynamically, the trained BP neural network model based on grey theory can be used to predict the security situation conveniently. The prediction results for the network security situation are shown in Fig. 8, and the test results are shown in Table 1.

Table 1
Test result

Algorithm type	Maximum relative residual error	Average relative residual error
GM (1,1) model	0.910	0.480
BP model	0.280	0.160
Grey neural network	0.136	0.069

Figure 7.
Training effect diagram of each model.

Figure 8.
Results of network security situation prediction.

From Fig. 8 and Table 1, we can see that both the expected and the predicted network security situation values increase significantly, and the network security state gradually worsens, as malicious data flow into the network. When the various types of malicious attacks paralyze the network, the situation value reaches its maximum. However, the prediction errors differ from model to model: the predictions of the grey GM (1,1) model and the BP neural network model deviate relatively far from the expected values, and for these models the maximum relative residual error is considerably larger than the average relative residual error. Because it combines the advantages of the grey model and the neural network model, the grey neural network achieves significantly higher prediction accuracy than the other models.
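
The two error measures in Table 1 can be computed as in the following minimal sketch. It assumes the relative residual of a sample is |expected - predicted| / |expected|, which the paper does not spell out; the function names and the sample values are hypothetical and are not the experimental data.

```python
def relative_residuals(expected, predicted):
    """Per-sample relative residuals (assumed definition, see lead-in)."""
    if len(expected) != len(predicted):
        raise ValueError("series must have the same length")
    # Expected values are assumed to be nonzero.
    return [abs(e - p) / abs(e) for e, p in zip(expected, predicted)]


def residual_summary(expected, predicted):
    """Return (maximum relative residual, average relative residual)."""
    r = relative_residuals(expected, predicted)
    return max(r), sum(r) / len(r)


# Hypothetical expected and predicted situation values.
expected = [0.50, 0.40, 0.80, 1.00]
predicted = [0.45, 0.44, 0.80, 0.90]
max_err, avg_err = residual_summary(expected, predicted)
```

A large gap between the maximum and the average, as for the GM (1,1) row of Table 1, suggests that a few samples are predicted much worse than the rest.
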
7. Conclusion

Network security situation prediction can automatically assess and predict the security situation in advance, discover abnormal events in time, grasp the network security status in real time, reduce network security risks, and improve network security protection capability. The prediction method for the cloud environment proposed in this paper still faces some difficulties in practical application. First, the prediction results are constrained by the extraction of security elements: if the selected security elements are not comprehensive, the awareness results for the network security situation will be seriously affected. Second, during the association processing of security factors and rule matching, it is difficult to eliminate redundant or false information effectively, to assess indexes and weights effectively, and to obtain the real network security situation value. Third, in selecting samples for training the grey neural network, if the training samples do not contain historical data series covering all attributes, the accuracy of the results may be low.
