Abstract
Existing network security prediction methods for the cloud environment are limited in terms of both accuracy and real-time performance. In this paper, we address these issues by proposing a method based on a grey neural network to predict network security situations in cloud environments. First, we explore security factors for network security situation awareness based on classification and fusion techniques in order to generate awareness indexes, and through this establish a hierarchical index system for the network security situation. Then, by analyzing the features of grey theory and neural networks, we elaborate a method that combines the two to predict network security situations with both high accuracy and real-time performance. Finally, the prediction algorithm is verified through experiments with simulated data. The experimental results show that the method is both correct and feasible.
Introduction
With the development of communication technology and cloud computing technology, computer networks are applied more and more extensively in every field of daily life. At the same time, network security incidents caused by malicious attack or destruction are becoming more common, and security issues have become increasingly prominent. In particular, security vulnerabilities and security incidents are expected to increase greatly, so networks face a serious security problem. Traditional equipment with limited protection functions, such as IDS (Intrusion Detection Systems), firewalls and security scanners, generally works in an independent or semi-independent state. Because these devices exchange little information with each other, they cannot meet the needs of network security. How to grasp the security state of the whole network comprehensively and promptly is therefore an important problem. There is no common cognition (a shared understanding of the state of the whole network) for protecting network resources, and this cognitive disconnect greatly delays administrators in making judgments or dealing with threats at the best time. Such a common understanding would help management gain a clear picture of the network situation as early as possible, which is expected to further help solve network security problems.
Situation Awareness (SA) was derived from the study of space flight. Since then, the technology has been widely used in fields such as the military battlefield, nuclear reaction control systems, air traffic control and medical emergency dispatch. The concept of Cyberspace Situation Awareness (CSA) was first proposed by Tim Bass in 1999 [1], who pointed out that network situation awareness based on fusion would become a development direction of network management. Situation emphasizes the connection between environment and entity; it is a dynamic state, a trend, and a whole, global concept. No single case or state can be called a situation. Situation awareness has become a hot research topic because, in a dynamic and complex network environment, decision makers can use situation awareness tools to grasp global state changes and make the right decisions.
At present, network security situation awareness does not have a unified, comprehensive definition. By extracting the relevant network and security factors, network security situation awareness analyzes and understands the security state of the whole network and predicts the future development of the network security situation. Its research focuses on the efficient organization and overall awareness of complex information, with the purpose of shortening the time from obtaining information to making decisions. It mainly applies information fusion technology to multi-source heterogeneous observational data, such as intrusion detection logs, firewall logs, virus logs, network scanning data, illegal external link data, the running state of equipment and real-time alarm information.
Network security situation awareness needs an accurate strategy for assessment and prediction. It helps keep the network running healthily, which has become a very important factor affecting social and economic development.
Traditional network security situation awareness provides security administrators with considerable analytical convenience, but most approaches are based on the analysis of system logs. They suffer from several problems: the data source is single, real-time performance is poor, and the awareness results depend too heavily on the experience of network management personnel. At present, the research direction of network security has changed from passive security system construction to intrusion detection and attack defense, and situation awareness is a form of active security system construction. Accordingly, research has shifted from single security problems to the security situation of the global network. However, security situation awareness is still at an early stage both at home and abroad, and the relevant technical theory is not yet mature, so it is urgent to find an effective and accurate awareness method.
Network security situation prediction is a general prediction of the future development trend of network security, and a useful supplement to network security situation awareness. At present, traditional prediction methods mainly lack real-time performance, and existing methods cannot accurately reflect future changes in the security situation [2]. Moreover, they do not handle the relationship between security elements and the future network security situation well. In this paper, a prediction method of network security situation based on temporal and spatial analysis is proposed.
To address the low accuracy and poor real-time performance of existing network security situation prediction in the cloud environment, a method based on a grey neural network is proposed. It is fast and efficient, and can fuse multi-source, heterogeneous information at multiple levels. By analyzing the features of grey theory and neural networks, the method combines the two to predict the network security situation with high accuracy and real-time performance; it presents a macro view and strengthens the understanding of network development. It also provides a reliable reference for understanding future network security and reliable decision support for administrators.
Related work
Network security situation awareness represents the current running state of a whole network from a macro, overall perspective and can also predict the network security state of the next phase. Research on it mainly includes two aspects: situation awareness and situation prediction. Before network security situation prediction, a large number of network security elements must be collected over some time or space. Generally, we first obtain the data of the relevant security situation factors, which are classified and integrated to obtain the overall network security state. Then, we analyze and predict its future development trend in numerical and graphic forms. Network security situation awareness and prediction provide a reliable basis for network administrators to make network security policy. For network security situation prediction, researchers have put forward a large number of prediction methods and models from multiple angles and dimensions.
In many countries, research on network security situation awareness mainly collects information from the various types of security equipment distributed across heterogeneous networks and applies different data fusion technologies to it. By establishing multiple awareness or prediction models, the purpose of network security situation awareness can be achieved. Representatively, Bass established a network model for intrusion detection systems with heterogeneous information to assess the current network security situation. Cristina Abad [3] uses Unclog+ to design a system for network security situation awareness. Through the collection and correlation analysis of network security events from security sensors, it fuses this information to obtain the network security situation.
At present, much valuable research has been carried out on network security situation awareness and prediction methods [4, 5], and the achievements obtained will guide future research. In [6], a new method of network security situation awareness based on a neural network is proposed, which addresses the time and space complexity of network security situation awareness. In [7], a real-time HMM-NSSP model based on the HMM model is proposed from a combination of theory and practice. The HMM-NSSP model is used to establish an HMM prediction model, and the parameters of network nodes are dynamically modified in order to monitor and predict the whole network security situation in time. For network security situation awareness based on the Hidden Markov model, [8] improves the acquisition of the sequence and the establishment of the transfer matrix, so that the risk value produced by the improved algorithm reflects the network security situation more reasonably.
Pu [9] uses the grey theory model GM (1,1) to predict the network security situation by mining the relevant information from the random time series of the network security situation. Wei et al. [10] apply a time series analysis and prediction method to network security situation prediction, whose principle is to predict future development according to the trend of the historical data series itself. Hu et al. [11] use a Bayesian network as a graph model of network security events and describe connection probabilities, from which the probability of a security event occurring at the next moment is derived. Wang [12] predicts the network security situation using an SVM model, which has the advantage of fast convergence.
The above prediction methods share a common shortcoming: they consider only the network security situation itself or the variation of the security situation sequence, without taking other security factors into account. The insufficient information causes a large difference between the predicted and actual results, so the prediction precision is not high. To address these problems, this paper proposes a method based on a grey neural network to predict the network security situation. The method makes full use of every security factor of the network security situation by combining the grey model GM (1,1) with multiple factors.
Network security situation awareness
Gaining situation factors
It is very important to construct a reasonable index system of the security situation before awareness or prediction of the network security situation. Different security factors, quantitative methods, network weighting methods, fusion methods and awareness models result in different network security situation values, which may influence the accuracy of awareness or prediction.
The network security situation is determined by a variety of situation factors, such as hacker attacks, viruses, Trojans and malicious code. Because of the complexity of the network and the diversity of devices and network security incidents, the network security elements must be considered from multiple angles of information [13]. Note that the situation factors are a subset of the security factors.
The fused network security information, which takes all aspects of the network fully into account, mainly comes from key network nodes, including firewalls, intrusion detection systems and all kinds of scanning systems. Network data flow information used for information fusion includes packet size and distribution, packet loss rate, data flow and flow change ratio, protocol distribution, data flows, outflow growth rate, source IP address and its distribution, and subnet bandwidth usage ratio. Alarm information from security equipment includes the frequency of security incidents, Trojans, viruses and other malicious attacks. Network vulnerability information includes the number of network vulnerabilities and the number of key equipment vulnerabilities. Security device configuration information includes the number of open ports, the number of key devices, the number of security devices in the subnetwork, and the number of anti-virus programs.
Generally, the original heterogeneous data collected from the network are complex and diverse, and contain irrelevant, false and redundant items. Therefore, mathematical methods are applied according to certain rules to extract network security information and obtain the main indexes of the network security situation. The extraction method used in this paper relies mainly on correlation analysis and rule matching to eliminate redundant, false and irrelevant security factors.
A network contains many devices, and the relationships between them are complex, so analyzing the raw data directly is not desirable. The collected data must therefore be converted into situation awareness indexes before situation awareness or prediction. The first step of network security situation awareness is to obtain the original security factors. Because of the large amount of information, comprehensive quantitative classification is needed so that each factor can be selected as a situation awareness index. The quantitative calculation of security factors mainly covers the change rate of network traffic, the packet loss rate of data streams, vulnerabilities, security equipment quantity statistics, malicious code, and security events such as Trojans and viruses. After quantitative processing, all factors are classified into the various indexes for situation awareness. The integration model for network security factors is shown in Fig. 1.
Integrated model of network security situation factors.
After redundancy elimination, quantitative processing and fusion classification, the awareness indexes can be used to assess the network security situation. According to the hierarchical structure of the network system, this paper selects different levels (nodes, hosts and the whole network), different information sources (host running state, flow, alarm, log and asset allocation information) and different services in order to assess the network security situation. We describe the network security situation and its change regularities quantitatively through all the awareness indexes. The hierarchical architecture of the awareness indexes is shown in Fig. 2.
Hierarchical architecture of network security situation awareness index.
As shown in Fig. 2, the architecture is divided into three levels from bottom to top. Network security situation prediction is in fact a process of data fusion: from part to whole, the data gradually converge into the overall network security situation index. The network security situation is at the first level and is built on three macro indexes that form the three dimensions of the index system: the network basic running index, the network vulnerability index and the network threat index. The basic running index reflects the internal running state of the network itself, the vulnerability index reflects the potential vulnerability of the network, and the threat index reflects the attacks on the network. Each second-level index is weighted from the third-level indexes, and each of the three-level indexes is weighted from the relevant bottom-level data. The bottom-level data sources consist of many security factors that greatly affect the network security situation, mainly including flow, service state, resource consumption, vulnerability state, protection software, Trojans and other data.
The calculation of the index weights is an important step, and this paper bases it on the entropy weight method. The entropy weight method determines the weight of each index according to the amount of information that the index transmits to the decision maker at each level. The information required by the method comes from all kinds of network security equipment and security event information, so the awareness results are largely objective. We use this method to assign weights to the indexes at all levels. The smaller the entropy value of an awareness index, the more information the index contains and transmits, and the larger its corresponding weight. Taking the network threat index as an example, its lower-level indexes are weighted by the entropy method in the following steps.
In the database, the characteristic indexes are gathered and arranged according to a certain order, so the
In this formula,
Then, the entropy value will be converted to a weight value that can represent differences:
The index of network threat can be determined by all the lower network indexes, and the formula at time
It is converted to a normalized value of lower index at time
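The entropy weighting described above can be illustrated with a minimal Python sketch. It uses the textbook form of the entropy weight method, which is an assumption here and may differ in detail from the formulas of this paper; the function names `entropy_weights` and `weighted_index` and the sample values are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Return one weight per column (index) of a samples-by-indexes matrix.

    Assumes non-negative values and at least two samples (rows).
    """
    n = len(matrix)            # number of samples
    m = len(matrix[0])         # number of lower-level indexes
    k = 1.0 / math.log(n)      # scales each entropy into [0, 1]
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total == 0:
            # An all-zero column carries no information.
            divergences.append(0.0)
            continue
        probs = [v / total for v in column]
        # 0 * ln(0) is treated as 0 by skipping zero shares.
        entropy = -k * sum(p * math.log(p) for p in probs if p > 0)
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    if s == 0:
        return [1.0 / m] * m   # no index is informative: equal weights
    return [d / s for d in divergences]

def weighted_index(weights, values):
    """Fuse normalized lower-level index values into one upper-level index."""
    return sum(w * v for w, v in zip(weights, values))

# Hypothetical normalized observations: rows are samples, columns are indexes.
samples = [
    [0.20, 0.50, 0.10],
    [0.30, 0.50, 0.90],
    [0.25, 0.60, 0.20],
]
weights = entropy_weights(samples)
threat = weighted_index(weights, [0.2, 0.4, 0.9])
```

In this example the third column varies most across samples, so it has the smallest entropy and receives the largest weight, in line with the rule stated above.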
Grey theory was proposed by Chinese scholars Yang et al. [14]. Related research has developed quickly, and grey theory has become a complete theoretical system. Its research object is the uncertain system with “small samples” or “poor information”, in which part of the information is clear and part is unclear. Mainly by generating, developing and extracting valuable information from what is known, grey theory achieves a correct understanding and description of the system in order to carry out scientific prediction. The grey prediction model is based on known and unknown information from the past and present. It establishes a GM model extending from the past to the future, so as to determine the future development of the system and provide a basis for decisions.
Grey model GM (1,1)
GM (1,1) model is a main foundation in the grey prediction model and the prediction process is shown as follows: a sequence of
Its solution is:
Corresponding to the grey differential equation, it has:
Among them,
The predictive value of grey differential equation can be obtained from:
Then, the predictive value of original sequence
For the prediction of the network security situation, this paper starts with the three-level awareness indexes. Then, feeding the historical time series of the three-level indexes into the grey GM (1,1) model yields the predicted three-level index values at time
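As an illustration of GM (1,1) prediction, the following Python sketch implements the standard textbook procedure: accumulate the sequence, estimate the development coefficient a and the grey input b by least squares, then restore forecasts by differencing. The notation and the function names `gm11_fit` and `gm11_forecast` are assumptions and are not taken from this paper; the sketch assumes a positive sequence with at least three values and a nonzero a.

```python
import math

def gm11_fit(x0):
    """Estimate (a, b) of the grey equation x0(k) + a*z1(k) = b."""
    # First-order accumulated generating sequence (1-AGO).
    x1, running = [], 0.0
    for v in x0:
        running += v
        x1.append(running)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Least squares for y = a*u + b with u = -z (2x2 normal equations).
    u = [-zk for zk in z]
    n = len(u)
    su, sy = sum(u), sum(y)
    suu = sum(v * v for v in u)
    suy = sum(ui * yi for ui, yi in zip(u, y))
    det = n * suu - su * su
    a = (n * suy - su * sy) / det
    b = (suu * sy - su * suy) / det
    return a, b

def gm11_forecast(x0, steps=1):
    """Forecast the next `steps` values of the original sequence."""
    a, b = gm11_fit(x0)

    def x1_hat(k):
        # Time response of the whitened equation, with x1_hat(0) = x0[0].
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    n = len(x0)
    # Inverse accumulation restores values of the original sequence.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]

# Hypothetical index history growing by 10% per step.
history = [2.0 * 1.1 ** k for k in range(5)]
a, b = gm11_fit(history)
next_value = gm11_forecast(history, steps=1)[0]
```

For this smooth exponential history the forecast is close to the true next value 2.0 * 1.1 ** 5, which is the regime in which GM (1,1) is expected to work well with few samples.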
Grey neural network is established by combining a BP neural network with the solution of a differential equation for grey model. With a case:
Corresponding to the grey differential equation, it has:
This is a gray differential equation model with
Here,
The discrete solution of grey differential equation at the moment
BP neural network can be mapped to an extended neural network, which can be multiplied by
The resulting extended network, which is the grey neural network, can thus be mapped onto an extended BP neural network. The grey neural network is shown in Fig. 3.
The structure of grey neural network.
Here,
Grey theory is a branch of applied mathematics that studies phenomena whose information is partly clear, partly unclear and uncertain, which makes it very suitable for network security situation prediction. The grey neural network combines the advantages and characteristics of the grey model and the neural network in order to predict the network security situation in the cloud environment. The advantages of the grey model are that it requires few samples and does not depend on the distribution or trend of the sample sequence; it achieves coarse prediction and is simple to model and convenient to run. The advantages of the neural network lie mainly in its self-learning, self-organizing, adaptive and nonlinear processing abilities, which make up for the grey model's weakness in nonlinear fitting. Therefore, the prediction model is established by combining the grey model with the neural network. In this paper, the prediction idea and algorithm are as follows.
In network security situation prediction, the results are affected by many factors, and the GM (1,1) model alone generally cannot meet the requirements. To address this, the network security situation is predicted using a grey neural network with multi-factor correlation. To create the grey neural network prediction model, the predicted values of the three-level indexes are obtained from the grey GM (1,1) model. Here,
The steps of the prediction algorithm are as follows:
1) Collect the network security situation samples; the awareness indexes of the hierarchical network security situation are treated as the input data of the network security situation.
2) Input the three-level historical data sequences and use the GM (1,1) model to predict the three-level indexes, obtaining their predicted values.
3) Determine the number of samples (including the training samples and the prediction samples). The three predicted index values are treated as the input data of the grey neural network model, and the output is the predicted value of the network security situation.
4) Taking the mean square error over all training samples as the objective, optimize the parameters with the BP error feedback algorithm until the mean square error reaches its set value, which gives the parameters of the grey neural network.
5) Use the optimized neural network to predict the network security situation and obtain the situation prediction value.
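The training and prediction steps can be sketched in Python as follows. This is only an illustration under stated assumptions: a plain one-hidden-layer BP network stands in for the grey neural network structure of Fig. 3, the three inputs are assumed to be GM (1,1) predictions of the three indexes, and the data are synthetic. The function name `train_bp` is hypothetical; the default learning step, iteration limit and target mean square error mirror the values given in the experimental setup of this paper.

```python
import math
import random

def train_bp(samples, targets, hidden=4, lr=0.01, max_iter=1000,
             target_mse=0.001, seed=0):
    """Train a one-hidden-layer network with full-batch BP on the MSE."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Each hidden row holds n_in input weights followed by one bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
          for _ in range(hidden)]
    # Output weights: one per hidden unit, followed by one bias.
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + row[-1])
             for row in w1]
        y = sum(w * hi for w, hi in zip(w2, h)) + w2[-1]
        return h, y

    history = []
    n = len(samples)
    for _ in range(max_iter):
        g1 = [[0.0] * (n_in + 1) for _ in range(hidden)]
        g2 = [0.0] * (hidden + 1)
        sq_err = 0.0
        for x, t in zip(samples, targets):
            h, y = forward(x)
            err = y - t
            sq_err += err * err
            for j in range(hidden):
                g2[j] += err * h[j]
                # Error propagated back through the tanh hidden unit.
                dh = err * w2[j] * (1.0 - h[j] ** 2)
                for i in range(n_in):
                    g1[j][i] += dh * x[i]
                g1[j][-1] += dh
            g2[-1] += err
        history.append(sq_err / n)
        if history[-1] <= target_mse:
            break  # the set value of the mean square error is reached
        # Gradient step on half the mean square error.
        for j in range(hidden):
            for i in range(n_in + 1):
                w1[j][i] -= lr * g1[j][i] / n
        for j in range(hidden + 1):
            w2[j] -= lr * g2[j] / n

    return (lambda x: forward(x)[1]), history

# Synthetic stand-ins for GM (1,1) predictions of the three indexes.
gm_predictions = [
    [0.1, 0.2, 0.1], [0.3, 0.2, 0.4], [0.5, 0.4, 0.6],
    [0.7, 0.6, 0.8], [0.9, 0.8, 0.9], [0.2, 0.5, 0.3],
]
# Synthetic situation values; the weights 0.2, 0.3, 0.5 are arbitrary.
situation = [0.2 * r + 0.3 * v + 0.5 * t for r, v, t in gm_predictions]
predict, history = train_bp(gm_predictions, situation)
estimate = predict([0.5, 0.5, 0.5])
```

The list `history` records the mean square error per iteration, so it can be used to check convergence against the set value before the trained network is used for prediction.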
Experimental deployment
In order to validate the feasibility and rationality of the proposed prediction method for the network security situation, we set up an experimental environment with a number of hosts, routers, switches and servers. The experimental network topology is shown in Fig. 4.
Experimental network topology.
Users and attackers can access the hosts through the Internet. Many attack targets are located in the DMZ, including web servers, database servers, IDS servers, DNS servers, switches, network traffic monitoring servers and network intrusion detection systems.
The data sources for the experiment are mainly the flow information monitored by the NetFlow server, the intrusion detection logs from the Snort network intrusion detection system, the host vulnerability scanning information from Nessus, and the firewall logs. In the experiment, various kinds of malicious traffic attack the DMZ servers, including SQL injection attacks on the database server, UDP flood attacks on the web server and SYN flood attacks on the DNS server. These attacks produce a large number of abnormal requests to the servers within a short time, which slows server response and depletes CPU and memory resources; the servers become unable to respond to normal requests, and the entire network may be paralyzed.
The grey neural network must first be trained and then used for prediction. The simulation experiment is based on the MATLAB platform and makes full use of the MATLAB neural network toolbox. In this experimental environment, the security situation can be assessed quantitatively. An ordinary user (User) and an attacker (Attacker) can access the network hosts through the Internet. The specific attack steps are as follows:
Ordinary users access the Web server, the main database server and the FTP file transfer server; an SQL injection vulnerability is exploited to attack the main database server; SQL injection attacks are launched from the host (Host) against the backup database server Database2; UDP flooding attacks target the Web server; a worm attacks the FTP server; Using the 2)
The IDS attack information, Nessus scanning information, Snort log alarm information and router NetFlow network traffic information are collected and treated as the multi-source heterogeneous data sources of this simulation.
The attacker launches various attacks against the network environment continuously. For clarity of plotting, 300 samples are collected dynamically at 10-second intervals. Samples of four observable indexes (CPU utilization, subnetwork bandwidth usage, average data flow of the subnetwork and scanning alert information) are taken, and the original data are shown in Fig. 5. When the host is attacked, the real-time sample data fluctuate, and the fluctuations of the different indexes are related to each other.
Original data sampling diagram.
To prevent the differing magnitudes of the original data from adversely affecting the experiment, this paper normalizes the awareness data of the security factors so that the data lie in the range [0, 1] after processing.
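A minimal Python sketch of such a normalization is given below. The paper does not state which formula it uses, so min-max scaling is an assumption; the function name `min_max_normalize` and the sample readings are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical CPU utilization readings in percent.
cpu = [10, 20, 30]
cpu_scaled = min_max_normalize(cpu)
```

Each observable index would be scaled separately, so that indexes measured in different units become comparable before fusion.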
For the continuous original sample data in Fig. 5, it can be processed as a real number between 0
Discrete data sampling diagram.
In this paper, the experimental parameters are determined as follows. The network security situation is predicted from the first 7 factors combined with the three-level awareness indexes, which determines the number of input ports of the grey neural network; the number of output ports is 1. Historical awareness index values and the corresponding network security situation values are selected to form sample pairs. The number of samples H is 115 pairs, including 93 pairs of training samples and 8 pairs of prediction samples. The learning step of the BP error feedback algorithm is set to 0.01, the maximum number of iterations to 1000, and the target mean square error to 0.001. During prediction, the grey GM (1,1) model and the BP neural network are used to predict the network security situation for samples 108 to 115.
Analysis of experimental results
By collecting samples, the BP neural network is established through training on a large number of samples, which yields the corresponding parameters. The training process of the BP neural network is compared with that of the other algorithms in Fig. 7, which shows the training results and the corresponding training times.
Through the analysis of above experimental results, we can draw the following conclusions:
The BP neural network model based on grey theory (the improved algorithm) is easy to train on externally obtained prediction data. It converges efficiently and faster than the other two algorithms, indicating that the prediction model is feasible. Using a number of temporary monitoring data, the trained prediction model can readily predict the network security situation at a given moment, so the improved algorithm is effective.
When a set of run-time data is collected dynamically, the trained BP neural network model based on grey theory can conveniently predict the security situation. The prediction results of the network security situation are shown in Fig. 8, and the test results are shown in Table 1.
Test result
Training effect diagram of each model.
Results of network security situation prediction.
From Fig. 8 and Table 1, we can see that when malicious data flow into the network, both the expected and the predicted values of the network security situation increase significantly, and the network security state gradually becomes worse. When the various malicious attacks paralyze the network, the situation value reaches its maximum. However, the prediction errors of the models differ: the predictions of the grey GM (1,1) model and the BP neural network model deviate more from the expected values, and their maximum relative residuals are considerably larger than their average relative residual errors. Because it combines the advantages of the grey model and the neural network model, the grey neural network achieves significantly higher prediction accuracy than the other models.
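The two error measures used in this comparison can be computed as in the following Python sketch. It assumes that the relative residual is the absolute difference between expected and predicted value divided by the expected value; the function names and the numbers are hypothetical and are not the results of Table 1.

```python
def relative_residuals(expected, predicted):
    """Relative residual |expected - predicted| / |expected| per sample."""
    return [abs(e - p) / abs(e) for e, p in zip(expected, predicted)]

def residual_summary(expected, predicted):
    """Return (maximum relative residual, average relative residual)."""
    r = relative_residuals(expected, predicted)
    return max(r), sum(r) / len(r)

# Hypothetical expected and predicted situation values.
expected = [0.5, 0.8]
predicted = [0.45, 0.84]
max_res, avg_res = residual_summary(expected, predicted)
```

A model whose maximum relative residual is much larger than its average has a few badly predicted samples, which is the pattern described above for the GM (1,1) and BP models.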
Network security situation prediction can automatically assess and predict in advance, discover abnormal events in time, grasp the network security status in real time, reduce network security risks and improve network security protection capability. The prediction method for the cloud environment proposed in this paper faces some difficulties in practical application. First, the prediction results are constrained by the extraction of security elements: if the selection of security elements is not comprehensive, the awareness results of the network security situation will be seriously affected. Second, during the association processing of security factors and rule matching, it is difficult to eliminate redundant or false information effectively, to assess indexes and weights effectively, and to obtain the real network security situation value. Third, in sample selection for training the grey neural network, if the training samples do not contain historical data series with all attributes, the accuracy of the results may be low.
