CN104734916B

CN104734916B - Efficient multi-level abnormal traffic detection method based on the TCP protocol

Info

Publication number: CN104734916B
Application number: CN201510104409.2A
Authority: CN
Inventors: 徐光侠; 吴群; 刘宴兵; 常光辉; 李娜; 梁绍飞; 胡杰; 李来军; 高诗意
Original assignee: Chongqing University of Posts and Telecommunications
Current assignee: Chongqing University of Posts and Telecommunications
Priority date: 2015-03-10
Filing date: 2015-03-10
Publication date: 2018-04-27
Anticipated expiration: 2035-03-10
Also published as: CN104734916A

Abstract

The invention claims an efficient multi-level abnormal traffic detection method based on the TCP protocol, which adds a multi-level detection mechanism to the conventional abnormal traffic detection process. The method detects anomalies in the data flow sent by clients in the network. It applies the differential mean method to stabilize the original traffic generated by the clients by differencing and, at the same time, analyzes the statistics of the traffic already present in the network to set an adaptive threshold interval. The stabilized traffic is then checked against this adaptive threshold, and data packets that have gone through the primary detection undergo further anomaly detection. This further detection parses the data packets forwarded by the router, extracts their key fields, and uses those fields to decide whether a packet sent by the client is abnormal. The invention improves detection accuracy and is simple to implement.

Description

An Efficient Multi-level Abnormal Traffic Detection Method Based on TCP Protocol

Technical field

The invention belongs to the technical field of communication anomaly detection. It relates to fast, real-time detection of the various anomalies that occur on the Internet, and specifically to an efficient multi-level abnormal traffic detection method based on the TCP protocol.

Background

Abnormal traffic detection is an important part of network monitoring. Abnormal network traffic is traffic whose behavior deviates from normal behavior. It has many causes: a device failure that disrupts communication, abnormal network operations, sudden surges of access (flash crowds), network intrusions, and so on. As networks grow, topologies become more complex, devices more diverse, and user populations larger, anomaly detection is an important safeguard for communication security. While users look to the network for convenient communication and trust it, new types of network threats keep emerging. Discovering and eliminating these threats is a central task of network anomaly detection and an important part of keeping network communication normal.

The attacks and threats a network faces come mainly from inside the network. Large numbers of network viruses, active attacks by hosts inside the network, and sudden surges of abnormal traffic overload network devices, which leads to congestion and may eventually paralyze the network. In a SYN flood DDoS attack, a malicious user exploits a weakness in the TCP three-way handshake and forges the IP addresses of normal users, causing immeasurable losses to the network. When anomalies are present, the first step is therefore to find them and raise an alarm. Moreover, an anomaly does not attack only one point; it spreads as widely as possible, aiming to reach the largest possible part of the network and to produce several types of anomalies. This calls for a detection method that finds anomalies quickly and in real time and cuts them off so that the network can communicate normally.

Abnormal network traffic is characterized by sudden changes in traffic and unknown precursor features, and it can do great harm to the network or its computers within a short time. Detecting abnormal traffic behavior quickly and in real time, determining its cause, and responding appropriately is therefore one of the prerequisites for the effective operation of the network, and reducing the losses caused by malicious attacks is another important aspect of network security.

Existing anomaly detection methods, such as the nonlinear abnormal traffic detection method (NLPP), methods based on wavelet analysis, and methods based on the ARMA model, can detect anomalies quickly and in real time, but their computational complexity is high, their results are not accurate enough and often carry a high false alarm rate, and they can be used only when the traffic data exhibits long-range dependence. Most collected traffic data, however, shows no obvious correlation, and its fluctuation trend is often non-stationary, which severely limits the applicability of these methods. The method proposed by the invention preprocesses the traffic data before detection and thereby overcomes this limitation. In addition, the multi-level detection mechanism added on top of the traditional detection effectively reduces the false alarm rate.

Summary of the invention

In view of the deficiencies of the prior art, the object of the invention is to provide a method that lowers the false alarm rate while maintaining accuracy and that is simple and easy to implement. The technical solution is as follows: an efficient multi-level abnormal traffic detection method based on the TCP protocol, comprising the following steps:

101. Collect network traffic data over a time period T. For the original traffic sequence R in that data, let x_t denote the observed value at time t, with x_t ∈ R, t = 1, 2, …, T. Remove every unusable value x_t that satisfies |x_t| > k·var_R, where k is the Grubbs criterion coefficient and var_R is the variance of the sequence R, and keep the remaining traffic data as the traffic observation sequence X;

102. Apply differential stabilization preprocessing to the observation sequence X. The result is the differential traffic sequence D, whose values are d_t = x_t - x_{t-1}, t > 1, with d_t ∈ D, t = 1, 2, …, N. The sequence D is the input to step 103;

103. Compute the mean and variance of the sequence X and of the sequence D, and from them estimate the interval [l_t, h_t] in which the differential traffic value at time t should lie [formula omitted in source], where p_t is the predicted threshold value at time t, l_t and h_t are the minimum and maximum differential traffic allowed at time t, and var_d_t is the variance of the differential traffic at time t. Once the differential traffic sequence D from step 102 is input, the firewall enables its primary detection and defense function and checks the incoming data against the predicted threshold p_t. If the differential traffic value at time t lies within [l_t, h_t], the traffic is judged normal and forwarded to the server; if it falls outside [l_t, h_t], it is judged abnormal and processing jumps to step 104;

104. The firewall's multi-level detection system parses the forwarded data packet, extracts its key fields key_field, and examines them. If no abnormal field is found, the packet is forwarded to the server; if an abnormal field is detected, the packet is discarded;

105. After the second check in step 104, normal data packets are forwarded to the server, so that the server and the client establish the first handshake;

106. After the first handshake, the server sends a reply message M_response to the client and waits for the client's acknowledgement ACK. When the client receives M_response, the second handshake is established; when the server receives the ACK, the third handshake is established and the two sides can communicate.

Further, the differential stabilization preprocessing of step 102 comprises:

S21: For the network traffic data sequence {x_1, x_2, …, x_T} collected over the time period T, remove the abnormal values and keep the normal ones. A value x_t is abnormal if |x_t| > k·var_R, where k is the Grubbs criterion coefficient and var_R is the variance of the sequence R. The retained values form the observation sequence X;

S22: Compute the mean of the original sequence R and its variance var_R;

S23: Difference the observation sequence X to obtain d_t ∈ D, t = 1, 2, …, T, where d_t = x_t - x_{t-1}, t > 1;

S24: Compute the mean and the variance var_D of the differential sequence D.

Further, the mean of the original sequence R is given by [formula omitted in source], and var_R denotes its variance [formula omitted in source].

The advantages and beneficial effects of the invention are as follows:

The invention uses an efficient multi-level abnormal traffic detection method based on the TCP protocol to check and judge traffic data several times. Existing anomaly detection techniques have a high false alarm rate, which greatly reduces detection accuracy. To safeguard detection accuracy and the security of network communication, the invention proposes multi-level anomaly detection. First, offline and on a statistical basis, an adaptive threshold method estimates the interval in which the traffic at the next moment should lie. Then the estimated interval is used online for detection. When the result is abnormal, multi-level detection follows, in which the key information in the data packet is extracted and examined. This multi-level mechanism effectively reduces the false alarm rate of anomaly detection. The method uses the differential mean and variance to obtain differential traffic with a stabilized trend and has low computational complexity. Combining offline computation with online detection makes detection fast enough to meet real-time requirements. The added multi-level detection mechanism both lowers the false alarm rate and safeguards the security of network communication.

Description of the drawings

Fig. 1 is a flow diagram of the multi-level anomaly detection of the invention;

Fig. 2 is a schematic diagram of the traffic data screening of the invention;

Fig. 3 is a schematic diagram of the sequence stabilization of the invention;

Fig. 4 is a schematic diagram of the adaptive threshold calculation of the invention;

Fig. 5 is a schematic diagram of the multi-level anomaly detection of the invention.

Detailed description

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiment is only one embodiment of the invention, not all of them.

Fig. 1 is a flow diagram of the multi-level anomaly detection of the invention. The invention builds on the TCP three-way handshake used in Internet communication. During communication, a client issues a request to access the server and establish a connection. When access is normal, it proceeds as follows:

S1: The client initiates a request, and the request packet is forwarded by the router to the firewall. On receiving the packet, the firewall collects traffic statistics and performs the primary detection, that is, differential traffic detection. If the traffic lies within the predicted differential traffic interval [l_t, h_t], the firewall judges the packet normal and forwards it to the server. Otherwise the packet is judged abnormal and the firewall starts further detection: it parses the packet, extracts the key field key_field, and examines it. If the result is still abnormal, the packet is discarded; if it is normal, the packet is marked as a misjudgment and forwarded to the server. The client and the server then establish the first handshake;

S2: While establishing the first handshake with the client, the server sends a reply message M_response to the client, waits for the client's acknowledgement ACK, and starts the wait timer T_wait. When the client receives M_response, the second handshake is established;

S3: If the server's waiting time exceeds the maximum waiting time, the server discards the packet; otherwise, once the server receives the ACK sent by the client, the third handshake is established and the two sides can communicate.
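The timeout rule in S3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `await_ack`, the input format, and the returned labels are assumptions made for illustration, and no real sockets are involved.

```python
def await_ack(ack_delay, t_wait_max):
    """Return the server's decision for one pending connection.

    ack_delay: seconds between sending M_response and receiving the ACK,
               or None if no ACK ever arrived (hypothetical input format).
    t_wait_max: maximum waiting time allowed for the timer T_wait.
    """
    if ack_delay is None or ack_delay > t_wait_max:
        return "discard"       # waiting time exceeded: drop the request
    return "established"       # ACK received in time: third handshake done


print(await_ack(0.2, t_wait_max=3.0))   # ACK arrived in time
print(await_ack(None, t_wait_max=3.0))  # e.g. a forged source address never answers
```

A forged source address, as in the SYN flood case described in the background, never returns the ACK, so under this rule the pending request is discarded once the timer expires.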

Fig. 2 is a schematic diagram of the traffic data screening of the invention. Traffic data is collected over a period T at a sampling interval of 1 s. The collected original traffic sequence is denoted R and the observed value at time t is denoted x_t, where x_t ∈ R, t = 1, 2, …, T. The mean of the original sequence {x_1, x_2, …, x_T} is denoted by [symbol omitted in source] and its variance by var_R [formulas omitted in source].

Before the traffic data is screened, the observation sequence X is initialized [initial value omitted in source]. Each value in the sequence R is then examined in turn: if the value x_t at time t satisfies |x_t| > k·var_R, where k is the Grubbs criterion coefficient, x_t is removed; otherwise it is added to the observation sequence, X = X ∪ {x_t}. The result is x_t ∈ X, t = 1, 2, …, N.
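The screening rule can be sketched in Python as follows. The sketch applies the criterion |x_t| > k·var_R literally as written; the use of the population variance, the starting point of an empty X, the sample data, and the value of k are all assumptions chosen for illustration, since the source does not reproduce the variance formula or give a value for k.

```python
from statistics import pvariance


def filter_observations(raw, k):
    """Build the observation sequence X from the original sequence R.

    A value x_t is dropped when |x_t| > k * var_R, as stated in the text.
    var_R is computed as the population variance (an assumption).
    """
    var_r = pvariance(raw)
    observed = []                       # X assumed to start empty
    for x_t in raw:
        if abs(x_t) > k * var_r:
            continue                    # unusable value, removed
        observed.append(x_t)            # X = X ∪ {x_t}
    return observed


raw = [10, 12, 11, 13, 500, 12, 11]     # illustrative samples, one per second
observed = filter_observations(raw, k=0.01)
print(observed)
```

Note that the classical Grubbs test compares the deviation from the mean with a multiple of the standard deviation, whereas the text compares |x_t| with a multiple of the variance; the sketch follows the text.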

Fig. 3 is a schematic diagram of the sequence stabilization of the invention. So that the differential sequence better reflects the trend of the data fluctuations, the differential sequence of the original sequence is defined as D; that is, the observation sequence {x_1, x_2, …, x_N} is preprocessed, with d_t denoting the difference at time t: d_t = x_t - x_{t-1}, t > 1, d_t ∈ D, t = 1, 2, …, N. The mean of D, the differential mean at time t, and the variance var_D of D are defined by formulas that are omitted in the source, as is the limit for N → ∞ from which the source concludes that the differential traffic tends to be stationary.
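A minimal Python sketch of the differencing step and the statistics of D follows. The population variance and the sample values are assumptions, since the source formulas are not reproduced.

```python
from statistics import mean, pvariance


def difference(observed):
    """Return D with d_t = x_t - x_{t-1} for t > 1."""
    return [curr - prev for prev, curr in zip(observed, observed[1:])]


observed = [10, 12, 11, 13, 12, 11]     # illustrative observation sequence X
d = difference(observed)
mean_d = mean(d)
var_d = pvariance(d)                    # assumption: population variance

print(d, mean_d, var_d)
# The differences telescope: their sum equals x_N - x_1.
print(sum(d) == observed[-1] - observed[0])
```

Because the differences telescope, their mean equals (x_N - x_1)/(N - 1), which shrinks toward zero as N grows whenever the traffic values stay bounded. This is one way to read the statement that the differenced traffic tends to be stationary.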

Fig. 4 is a schematic diagram of the adaptive threshold calculation of the invention. Online real-time anomaly detection first requires the adaptive threshold to be computed. Let l_t and h_t denote the minimum and maximum differential traffic allowed at time t. The invention computes the adaptive threshold through a refresh mechanism, as follows:

The predicted threshold value p_t at time t is obtained by superimposing the differential traffic of the previous moment [formula omitted in source], where α is a weighting constant determined mainly by the number of packet-sending hosts in the model. It controls the share of new data in the model, and therefore how quickly the model adapts to local behavior, which establishes the refresh mechanism of the normal model. If the current observation fully matches the normal model, it is considered normal. Because real traffic rarely matches the theoretical model exactly, a confidence interval is set from the standard deviation of the observations. The width of the tolerance range depends on how many standard deviations are added; 2 to 3 standard deviations are typical. Since the data has already been differenced, 2 standard deviations are used here, giving the threshold range [formula omitted in source], where n is the number of clients. The adaptive threshold interval is thus [l_t, h_t].
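The refresh mechanism can be sketched in Python under explicit assumptions. The source formulas for p_t and for [l_t, h_t] are not reproduced in this text, so the sketch assumes an exponentially weighted update in which α is the share of the newest difference, and a band of 2 standard deviations around p_t. The dependence on the number of clients n is not modelled, and all function names are hypothetical.

```python
from math import sqrt


def update_threshold(p_prev, d_prev, alpha):
    """Assumed refresh rule: alpha weights the newest difference d_{t-1}."""
    return alpha * d_prev + (1 - alpha) * p_prev


def threshold_interval(p_t, var_d, width=2.0):
    """Assumed band: p_t plus or minus `width` standard deviations."""
    sigma = sqrt(var_d)
    return p_t - width * sigma, p_t + width * sigma


def is_normal(d_t, interval):
    """Primary detection: normal if d_t lies within [l_t, h_t]."""
    low, high = interval
    return low <= d_t <= high


p_t = update_threshold(p_prev=0.0, d_prev=2.0, alpha=0.2)
interval = threshold_interval(p_t, var_d=2.16)
print(p_t, interval)
print(is_normal(1.0, interval), is_normal(50.0, interval))
```

A larger α makes the predicted value follow recent traffic more closely, which matches the description of α as controlling how quickly the model adapts to local behavior.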

Fig. 5 is a schematic diagram of the multi-level anomaly detection of the invention. The request packet sent by the client is forwarded by the router to the firewall, which, on receiving it, computes the adaptive threshold interval from the existing observation sequence. The first adaptive threshold detection is performed first: based on the statistics of the network traffic, it checks the traffic data of the next moment. If this check judges the packet normal, the packet is forwarded directly to the server; if it judges the packet abnormal, multi-level detection follows to confirm whether the packet is really abnormal.

In the further anomaly detection, the packet is parsed and the key field key_field is extracted and examined. If it is normal, the packet is marked as a misjudgment and forwarded to the server; if the result is still abnormal, the packet is discarded. Meanwhile, the next packet in the queue Q is processed.
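The two-stage decision can be sketched in Python as follows. The source does not specify which fields make up key_field or how they are judged, so the check is passed in as a caller-supplied predicate; the packet representation, the labels, and the example rule `syn_only` are hypothetical.

```python
from collections import deque


def classify(d_t, key_field, interval, key_field_ok):
    """Two-stage decision for one packet.

    d_t: differential traffic value observed when the packet arrived.
    key_field: the extracted field(s); the format here is hypothetical.
    interval: (l_t, h_t) from the adaptive threshold stage.
    key_field_ok: predicate that judges key_field (not specified in the source).
    """
    low, high = interval
    if low <= d_t <= high:
        return "forward"              # primary detection: normal
    if key_field_ok(key_field):
        return "forward_misjudged"    # primary alarm was a misjudgment
    return "drop"                     # abnormal in both stages


def process_queue(queue, interval, key_field_ok):
    """Drain the queue Q in arrival order and classify each packet."""
    decisions = []
    while queue:
        d_t, key_field = queue.popleft()
        decisions.append(classify(d_t, key_field, interval, key_field_ok))
    return decisions


def syn_only(flags):
    """Hypothetical rule: a connection request carries only the SYN flag."""
    return flags == {"SYN"}


Q = deque([(1.0, {"SYN"}), (40.0, {"SYN"}), (40.0, {"SYN", "FIN"})])
decisions = process_queue(Q, interval=(-3.0, 3.0), key_field_ok=syn_only)
print(decisions)
```

Only packets that fail the threshold check reach the key field check, so the more expensive parsing is limited to suspicious traffic, and a packet is discarded only when both stages agree.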

The embodiments above are to be understood as illustrating the invention only, not as limiting its scope of protection. After reading this disclosure, a person skilled in the art can make various changes or modifications to the invention, and such equivalent changes and modifications likewise fall within the scope defined by the claims.

Claims

1. An efficient multi-level abnormal traffic detection method based on the TCP protocol, characterized in that it comprises the following steps:

101. Collect network traffic data over a time period T. For the original sequence R in that data, let x_t denote the observed value at time t, with x_t ∈ R, t = 1, 2, …, T. Remove every unusable traffic value x_t that satisfies |x_t| > k·var_R, where k is the Grubbs criterion coefficient and var_R is the variance of the original sequence R, and keep the remaining traffic data as the observation sequence X;

102. Apply differential stabilization preprocessing to the observation sequence X. The result is the differential sequence D, whose values are d_t = x_t - x_{t-1}, t > 1, with d_t ∈ D, t = 1, 2, …, N. The differential sequence D is the input to step 103;

103. Compute the mean and variance of the observation sequence X and of the differential sequence D, and from them estimate the interval [l_t, h_t] in which the differential traffic value at time t should lie [formula omitted in source], where n is the number of clients, p_t is the predicted threshold value at time t, l_t and h_t are the minimum and maximum differential traffic allowed at time t, and var_d_t is the variance of the differential traffic at time t. Once the differential sequence D from step 102 is input, the firewall enables its primary detection and defense function and checks the incoming data against the predicted threshold p_t. If the differential traffic value at time t lies within [l_t, h_t], the traffic is judged normal and forwarded to the server; if it falls outside [l_t, h_t], it is judged abnormal and processing jumps to step 104;

104. The firewall's multi-level detection system parses the forwarded data packet, extracts its key fields key_field, and examines them. If no abnormal field is found, the packet is forwarded to the server; if an abnormal field is detected, the packet is discarded;

105. After the second check in step 104, normal data packets are forwarded to the server, so that the server and the client establish the first handshake;

106. After the first handshake, the server sends a reply message M_response to the client and waits for the client's acknowledgement ACK. When the client receives M_response, the second handshake is established; when the server receives the ACK, the third handshake is established and the two sides can communicate.

2. The efficient multi-level abnormal traffic detection method based on the TCP protocol according to claim 1, characterized in that the differential stabilization preprocessing of step 102 comprises:

S21: For the network traffic data sequence {x_1, x_2, …, x_T} collected over the time period T, remove the abnormal values and keep the normal ones. A value x_t is abnormal if |x_t| > k·var_R, where k is the Grubbs criterion coefficient and var_R is the variance of the original sequence R. The retained values form the observation sequence X;

S22: Compute the mean of the original sequence R and its variance var_R;

S23: Difference the observation sequence X to obtain d_t ∈ D, t = 1, 2, …, T, where d_t = x_t - x_{t-1}, t > 1;

S24: Compute the mean and the variance var_D of the differential sequence D.

3. The efficient multi-level abnormal traffic detection method based on the TCP protocol according to claim 2, characterized in that the mean of the original sequence R is given by [formula omitted in source], and var_R denotes the variance of the original sequence [formula omitted in source].