
CN118075030A - Network attack detection method and device, electronic equipment and storage medium - Google Patents

Network attack detection method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN118075030A
Authority
CN
China
Prior art keywords
attack
vector
sample
stage
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410472119.2A
Other languages
Chinese (zh)
Other versions
CN118075030B (en)
Inventor
梅阳阳
韩伟红
贾焰
李树栋
顾钊铨
林凯瀚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peng Cheng Laboratory
Original Assignee
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peng Cheng Laboratory filed Critical Peng Cheng Laboratory
Priority to CN202410472119.2A priority Critical patent/CN118075030B/en
Publication of CN118075030A publication Critical patent/CN118075030A/en
Application granted granted Critical
Publication of CN118075030B publication Critical patent/CN118075030B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416Event detection, e.g. attack signature detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/20Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40Network security protocols

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The disclosure provides a network attack detection method and device, an electronic device, and a storage medium. The method includes: acquiring a reference attack alarm sequence vector corresponding to a reference attack, and generating a basic attack alarm sequence vector based on the reference attack alarm sequence vector; generating a sample alarm sequence vector based on the basic attack alarm sequence vector; inputting the sample alarm sequence vector into a time-series convolution network and a classifier which are sequentially connected to obtain a sample detection attack stage corresponding to the sample alarm sequence vector, and training the time-series convolution network and the classifier based on the sample detection attack stage; and inputting a target attack alarm sequence vector into the trained time-series convolution network and classifier to obtain an actual detection attack stage, and detecting the network attack by using the actual detection attack stage. The embodiments of the disclosure can be applied to various scenarios such as cloud technology, artificial intelligence, intelligent transportation, and assisted driving.

Description

Network attack detection method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of network detection, and in particular relates to a network attack detection method, a network attack detection device, electronic equipment and a storage medium.
Background
The rapid evolution of network threats poses a serious challenge to information security. With the widespread use of digital technology, network attack techniques have become increasingly diverse and complex. Multi-stage continuous network attacks typically employ advanced techniques, such as zero-day exploits, social engineering, and targeted attacks, to bypass traditional security defense mechanisms, making detection and countermeasures particularly difficult. For example, advanced persistent threat (Advanced Persistent Threat, APT) attacks are a highly specialized and covert form of attack that poses a significant threat to businesses and organizations through long-term penetration of the target system.
Because the multi-stage continuous network attack actions have concealment and persistence, the dispersion among different attack actions is large, the dependence length of different attack stages is increased, and the complexity of arrangement is also increased. In identifying multi-stage continuous network attacks, it is necessary to identify relationships and dependencies between different attack stages. The network attack detection method in the related art cannot better solve the problem of long-term dependence of the multi-stage continuous network attack stage, so that the detection effect on the multi-stage continuous network attack is poor.
Disclosure of Invention
The present disclosure is directed to solving at least one of the technical problems existing in the prior art. Therefore, the present disclosure provides a network attack detection method, device, electronic device and storage medium, which can effectively detect a network attack with multi-stage continuity.
According to an aspect of the present disclosure, there is provided a network attack detection method, including:
Acquiring a reference attack alarm sequence vector corresponding to a reference attack, wherein each reference attack alarm associated with the reference attack alarm sequence vector indicates that one attack appearance of the reference attack is detected, and the reference attack comprises a first number of attack stages;
generating a basic attack alarm sequence vector based on the reference attack alarm sequence vector, wherein the basic attack alarm sequence vector comprises partial vector elements extracted from vector elements of reference attack alarms corresponding to a second number of attack phases in the reference attack alarm sequence vector, and the second number is smaller than the first number;
Generating a sample alert sequence vector based on the base attack alert sequence vector, the sample alert sequence vector comprising a sample attack alert sequence vector and a sample non-attack alert sequence vector;
Inputting the sample alarm sequence vector into a sequential convolution network and a classifier which are sequentially connected to obtain a sample detection attack stage corresponding to the sample alarm sequence vector, and training the sequential convolution network and the classifier based on the sample detection attack stage;
Inputting the target attack alarm sequence vector into the trained time sequence convolution network and the classifier to obtain an actual detection attack stage, and detecting the network attack by using the actual detection attack stage.
According to an aspect of the present disclosure, there is provided a network attack detection apparatus including:
The acquisition module is used for acquiring a reference attack alarm sequence vector corresponding to a reference attack, wherein each reference attack alarm associated with the reference attack alarm sequence vector indicates that one attack appearance of the reference attack is detected, and the reference attack comprises a first number of attack stages;
The first vector generation module is used for generating a basic attack alarm sequence vector based on the reference attack alarm sequence vector, wherein the basic attack alarm sequence vector comprises partial vector elements extracted from vector elements of reference attack alarms corresponding to the second number of attack phases in the reference attack alarm sequence vector, and the second number is smaller than the first number;
A second vector generation module for generating a sample alert sequence vector based on the base attack alert sequence vector, the sample alert sequence vector comprising a sample attack alert sequence vector and a sample non-attack alert sequence vector;
The training module is used for inputting the sample alarm sequence vector into a sequential convolution network and a classifier which are sequentially connected, obtaining a sample detection attack stage corresponding to the sample alarm sequence vector, and training the sequential convolution network and the classifier based on the sample detection attack stage;
the detection module is used for inputting the target attack alarm sequence vector into the trained time sequence convolution network and the classifier to obtain an actual detection attack stage, and detecting the network attack by utilizing the actual detection attack stage.
Optionally, the acquiring module is specifically configured to:
Acquiring a reference attack alarm sequence corresponding to a reference attack, wherein each reference attack alarm in the reference attack alarm sequence indicates that one attack appearance of the reference attack is detected;
The reference attack alert sequence is converted to the reference attack alert sequence vector.
Optionally, the first vector generation module is specifically configured to:
selecting a consecutive second number of attack phases from the first number of attack phases;
And extracting the partial vector elements from the vector elements of the reference attack alarms of each attack stage in the second number of attack stages in the reference attack alarm sequence vector based on a first proportion to form the basic attack alarm sequence vector.
Optionally, the network attack detection device further includes a second number determining module, where the second number determining module is configured to:
Acquiring an attack set to be detected, wherein the attack set to be detected comprises a plurality of attacks to be detected;
determining the number of attack stages included in each attack to be detected;
The second number is determined based on an average of the attack stage numbers.
Optionally, the network attack detection device further includes a first proportion determining module, where the first proportion determining module is configured to:
For each of the second number of attack phases, obtaining a third number of reference attack alarms in the attack phase;
For each of the second number of attack phases, obtaining a fourth number of reference attack alarms of the attack phases that is different from the previous reference attack alarm;
The first ratio is determined based on the third number and the fourth number.
Optionally, the second vector generation module is specifically configured to:
Inserting a first noise vector element between the vector elements of the reference attack alarm in every two adjacent attack phases in the basic attack alarm sequence vector, and inserting a second noise vector element before and after the basic attack alarm sequence vector to obtain a sample attack alarm sequence vector;
And replacing the vector elements of the reference attack alarm in each attack stage in the sample attack alarm sequence vector with random vector elements to obtain the sample non-attack alarm sequence vector.
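As a rough illustration of this sample-generation idea, the following Python sketch (an assumption for readability, not the patent's implementation; the noise distribution, helper names and data layout are hypothetical) inserts noise elements around and between the selected phases, then derives a non-attack sample by randomizing the in-phase elements while keeping the noise positions.

```python
import numpy as np

def build_sample_vectors(base_phases, dim, rng=None):
    """Hypothetical sketch: base_phases is a list of attack phases, each a list of
    1-D alert vectors of length `dim`. Returns (sample_attack_vector,
    sample_non_attack_vector) as lists of 1-D vectors."""
    rng = rng or np.random.default_rng(0)
    noise = lambda: rng.random(dim)           # assumed form of the noise vector elements

    attack_sample, is_phase_element = [noise()], [False]   # noise element before the sequence
    for phase in base_phases:
        for v in phase:                        # reference attack alarms of this phase
            attack_sample.append(v)
            is_phase_element.append(True)
        attack_sample.append(noise())          # noise between adjacent phases / after the last phase
        is_phase_element.append(False)

    # non-attack sample: replace every in-phase alarm element with a random vector
    non_attack_sample = [rng.random(dim) if flag else v
                         for v, flag in zip(attack_sample, is_phase_element)]
    return attack_sample, non_attack_sample
```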
Optionally, the training module is specifically configured to:
Inputting the sample alarm sequence vector into the time sequence convolution network to obtain a first stage characteristic vector corresponding to each attack stage;
And inputting the obtained first-stage feature vector corresponding to each attack stage into the classifier to obtain the sample detection attack stage corresponding to the sample alarm sequence vector.
Optionally, the time series convolution network comprises a first extended causal convolution layer, a channel attention model, a spatial attention model, a multiplier, a second extended causal convolution layer, and a residual sum layer; the training module comprises a first-stage feature vector generation sub-module for:
inputting the sample alarm sequence vector into the first extended causal convolution layer to obtain a first sample extended convolution vector;
inputting the first sample expansion convolution vector into the channel attention model to obtain a sample channel attention feature vector;
inputting the sample channel attention feature vector into the space attention model to obtain a sample space attention feature vector;
Multiplying the first sample spread convolution vector and the sample space attention feature vector by the multiplier to obtain a first sample multiplication vector;
inputting the first sample multiplication vector into a second extended causal convolution layer to obtain a second sample extended convolution vector;
And carrying out residual error sum processing on the sample alarm sequence vector and the second sample expansion convolution vector through the residual error sum layer to obtain a first stage characteristic vector corresponding to each attack stage.
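For orientation, here is a rough PyTorch sketch of a residual block with this layout (first dilated causal convolution, channel attention, spatial attention, multiplication, second dilated causal convolution, residual sum). The attention designs and layer sizes are assumptions, and the weight normalization, rectified linear unit and random-discarding layers described in the following paragraphs are omitted for brevity; this is not the patent's exact architecture.

```python
import torch
import torch.nn as nn

class AttentionTCNBlock(nn.Module):
    """Assumed sketch of a dilated-causal-convolution block with channel and spatial attention."""
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        pad = (kernel_size - 1) * dilation              # causal: pad on the left only
        self.pad = nn.ConstantPad1d((pad, 0), 0.0)
        self.conv1 = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        hidden = max(channels // 2, 1)
        # channel attention (squeeze-and-excitation style, assumed)
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(channels, hidden, 1), nn.ReLU(),
            nn.Conv1d(hidden, channels, 1), nn.Sigmoid())
        # spatial attention over the time axis (CBAM-like, assumed)
        self.spatial_att = nn.Sequential(
            nn.Conv1d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):                               # x: (batch, channels, time)
        h = self.conv1(self.pad(x))                     # first extended causal convolution
        c = h * self.channel_att(h)                     # sample channel attention feature vector
        s = self.spatial_att(c)                         # sample spatial attention feature vector
        m = h * s                                       # multiplier: conv output x spatial attention
        out = self.conv2(self.pad(m))                   # second extended causal convolution
        return out + x                                  # residual sum with the block input
```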
Optionally, the time sequence convolution network further comprises a first weight normalization layer, a first correction linear unit and a first random discarding layer; the first stage feature vector generation submodule includes a channel attention feature vector generation unit configured to:
performing weight normalization operation on the first sample spread convolution vector through the first weight normalization layer to obtain a first weight normalization feature vector;
Nonlinear activation is carried out on the first weight normalized feature vector through the first correction linear unit, and a first activation feature vector is obtained;
Randomly discarding vector elements in the first activated feature vector through the first random discarding layer to obtain a first discarded feature vector;
and inputting the first discarded feature vector into the channel attention model to obtain the sample channel attention feature vector.
Optionally, the time sequence convolution network further comprises a second weight normalization layer, a second correction linear unit and a second random discarding layer; the first stage feature vector generation sub-module further includes a first stage feature vector generation unit configured to:
Performing a weight normalization operation on the second sample spread convolution vector through the second weight normalization layer to obtain a second weight normalization feature vector;
Nonlinear activation is carried out on the second weight normalized feature vector through the second correction linear unit, and a second activated feature vector is obtained;
Randomly discarding vector elements in the second activated feature vector through the second random discarding layer to obtain a second discarded feature vector;
And performing residual sum processing on the sample alarm sequence vector and the second discarded feature vector through the residual sum layer to obtain the first-stage feature vector corresponding to each attack stage.
Optionally, the first stage feature vector generation sub-module further includes a channel attention model processing unit and a spatial attention model processing unit, where the channel attention model processing unit is configured to:
determining a channel attention coefficient based on the first sample spread convolution vector;
Determining the sample channel attention feature vector based on the first sample spread convolution vector and the channel attention coefficient;
The spatial attention model processing unit is used for:
determining a spatial attention coefficient based on the sample channel attention feature vector;
The sample spatial attention feature vector is determined based on the sample channel attention feature vector and the spatial attention coefficient.
Optionally, the training module further includes a loss function sub-module, where the loss function sub-module is configured to:
calculating a loss function based on the sample detection attack stage corresponding to the sample alarm sequence vector and a sample attack stage label;
training the time-series convolution network and the classifier based on the loss function.
According to an aspect of the present disclosure, there is provided an electronic device including a memory storing a computer program and a processor implementing a network attack detection method as described above when executing the computer program.
According to an aspect of the present disclosure, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the network attack detection method as described above.
According to the embodiments of the disclosure, a reference attack alarm sequence vector corresponding to a reference attack is obtained, and a basic attack alarm sequence vector is generated based on it. The reference attack comprises a first number of attack stages; the basic attack alarm sequence vector comprises partial vector elements extracted from the vector elements of the reference attack alarms corresponding to a second number of attack stages in the reference attack alarm sequence vector, where the second number is smaller than the first number. The reference attack alarm sequence vector is thus reconstructed, and the generated basic attack alarm sequence vector better matches how an actual network is attacked. Compared with the related art, in which all alarm sequence vectors of a given attack stage are directly used as training samples, training with the sample alarm sequence vectors generated from the basic attack alarm sequence vector enables the neural network model to capture the dependence among attack stages in a network attack. In addition, the neural network model adopts a time-series convolution network and a classifier that are sequentially connected, so accurate detection of multi-stage attacks can be achieved by extracting global and local temporal information from the sample alarm sequence vector; the sample detection attack stage corresponding to the sample alarm sequence vector is obtained, and the time-series convolution network and the classifier are trained based on it, so that after training they can effectively detect multi-stage continuous network attacks.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the disclosure. The objectives and other advantages of the disclosure will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The disclosure is further described below with reference to the drawings and examples, wherein:
FIG. 1 is a diagram of a system architecture to which a network attack detection method according to an embodiment of the present disclosure is applied;
FIG. 2 is a flow chart of a network attack detection method of an embodiment of the present disclosure;
FIG. 3 is a specific flowchart of step S210 in FIG. 2;
FIG. 4 is a schematic diagram of converting a reference attack alert sequence into a reference attack alert sequence vector according to an embodiment of the present disclosure;
FIG. 5 is a specific flowchart of step S220 in FIG. 2;
FIG. 6 is a schematic diagram of the present disclosure constituting a basic attack alert sequence vector;
FIG. 7 is a second number determination flow diagram of an embodiment of the present disclosure;
FIG. 8 is a schematic flow chart of determining a first scale according to an embodiment of the disclosure;
FIG. 9 is a schematic diagram showing a specific flow of step S230 in FIG. 2;
FIG. 10 is a schematic diagram of deriving a sample attack alert sequence vector and a sample non-attack alert sequence vector according to an embodiment of the present disclosure;
FIG. 11 is a schematic diagram showing a specific flow of step S240 in FIG. 2;
FIG. 12 is a schematic diagram of a structure of a time sequential convolution network and classifier according to an embodiment of the present disclosure;
FIG. 13 is a schematic diagram showing a specific flow of step S1110 in FIG. 11;
FIG. 14 is a schematic diagram showing a specific flow of step S1320 in FIG. 13;
FIG. 15 is a schematic flowchart of step S1360 in FIG. 13;
FIG. 16 is another flow chart of step S240 in FIG. 2;
FIG. 17 is a schematic diagram of a specific use procedure of a network attack detection method according to an embodiment of the present disclosure;
FIG. 18 is a schematic diagram of a network attack detection method according to an embodiment of the present disclosure;
FIG. 19 is a schematic structural diagram of a network attack detection device according to an embodiment of the present disclosure;
FIG. 20 is a partial block diagram of a terminal implementing a network attack detection method according to an embodiment of the present disclosure;
FIG. 21 is a partial block diagram of a server that performs a network attack detection method according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for explaining the present disclosure and are not to be construed as limiting the present disclosure.
In the description of the present disclosure, it should be understood that references to orientation descriptions, such as upper, lower, front, rear, left, right, etc., are based on the orientation or positional relationship shown in the drawings, are merely for convenience of describing the present disclosure and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present disclosure.
In the description of the present disclosure, the meaning of several is one or more, the meaning of plural is two or more, greater than, less than, exceeding, etc. are understood to not include the present number, and the above, below, within, etc. are understood to include the present number. The description of the first and second is for the purpose of distinguishing between technical features only and should not be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated or implicitly indicating the precedence of the technical features indicated.
In the description of the present disclosure, unless explicitly defined otherwise, terms such as arrangement, mounting, connection, etc. should be construed broadly and the specific meaning of the terms in the present disclosure can be reasonably determined by a person skilled in the art in connection with the specific contents of the technical solution.
In the description of the present disclosure, a description referring to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Before proceeding to further detailed description of the disclosed embodiments, the terms and terms involved in the disclosed embodiments are described, which are applicable to the following explanation:
Advanced persistent threat attack (Advanced Persistent Threat, APT): an APT attack is a long-term, persistent network attack behavior directed at a specific target. An attacker typically has explicit goals and motivations, gaining sensitive information in the target network or performing malicious operations through careful planning and continuous infiltration.
A time-series convolution network (Temporal Convolutional Network, TCN): a convolutional network designed specifically for processing time-series data. The architecture combines modeling capability in the time domain with the feature-extraction capability of convolution at a low parameter count, providing an efficient and accurate method for time-series tasks. The convolutions in the TCN architecture are causal convolutions, which lose as few features as possible after convolution, meaning that no information from the future is "leaked". Specifically, TCN adopts a one-dimensional fully convolutional network (1-D Fully-Convolutional Network, FCN) and adds zero padding to each layer so that the output length equals the input length. To guarantee causality from input to output, TCN introduces causal convolution (Causal Convolution), i.e., the output at time t can only be obtained by convolving inputs from time t and earlier. However, as the sequence length increases, the network depth or the convolution kernel size must grow, which results in significant computational expense. TCN therefore introduces dilated convolution (Dilated Convolutions), which enlarges the receptive field while keeping the network depth and parameter count as small as possible. Increasing the depth of the network may improve its performance but may also lead to gradient explosion or gradient vanishing.
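To make the causal-padding idea concrete, the following minimal PyTorch snippet (an illustration assumed here, not code from the patent) pads only on the left so that the output at time t never sees inputs after t while the output keeps the input length; the receptive field of a single layer is (kernel_size - 1) * dilation + 1.

```python
import torch
import torch.nn as nn

kernel_size, dilation = 3, 2
pad = (kernel_size - 1) * dilation
causal_conv = nn.Sequential(
    nn.ConstantPad1d((pad, 0), 0.0),                 # left-only zero padding keeps causality
    nn.Conv1d(in_channels=1, out_channels=1, kernel_size=kernel_size, dilation=dilation),
)

x = torch.arange(10, dtype=torch.float32).view(1, 1, 10)   # (batch, channels, time)
y = causal_conv(x)
print(y.shape)   # torch.Size([1, 1, 10]) -- same length, no future information used
```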
Classifier (Classifier): is a computer program whose design goal is to be able to automatically classify data into known categories after automatic learning. The classifier is a generic term of a method for classifying samples in data mining, and comprises algorithms such as decision trees, logistic regression, naive Bayes, neural networks and the like.
Currently, multi-stage detection of APT attacks mainly relies on similarity-based methods and causal-association-based methods. Similarity-based methods mainly achieve multi-stage attack identification through attribute matching, correlation calculation, or scenario clustering. For example, one approach uses two different neural network methods (a multi-layer perceptron and a support vector machine) to automatically extract attack strategies from intrusion alarms; the probability outputs of the two methods indicate the causal relationship between preceding and subsequent alarms, and the association strength of the alarms is represented by an alarm association matrix. Another detection method uses the continuity features of the intrusion alarms, such as IP address, port, time interval, and alarm type, to represent the correlation between attack steps by calculating the correlation of contextual alarm features. Yet another detection method introduces a set of equation constraints into the alarm correlation matrix to evaluate the similarity between two alarms.
Unlike similarity-based methods, causal-association-based methods focus on analyzing multi-step sequences and the causal relationships between their steps, either by statistical inference or by model matching. For example, one rule-prediction method detects multi-step attack scenarios: it discovers alarm combinations with a multi-level episode mining and filtering algorithm, encodes the association strength between attack types in the attack scenario with an association matrix, and identifies multi-step attacks by learning the scale of different attack patterns with a decision tree algorithm. The hidden Markov model (Hidden Markov Model, HMM) is also commonly used for multi-step attack recognition. Although these detection methods are effective to some extent, they cannot effectively identify the long-term dependencies between APT attack stages.
These methods can detect APT attacks, but their capacity for processing long sequences is insufficient: they cannot adequately address the long-term dependence between APT attack stages, and there is considerable room for performance improvement. The method and device of the present disclosure can capture the dependencies among the attack stages of a network attack, so that APT attacks can be effectively detected.
System architecture description of the application of embodiments of the present disclosure
Fig. 1 is a system architecture diagram to which a network attack detection method according to an embodiment of the present disclosure is applied. The architecture includes a terminal 110, a server 120, a network attack detection device 130, and the like.
Terminal 110 is a device for obtaining a reference attack alert sequence vector corresponding to a reference attack, terminal 110 is deployed with a network attack detection system, which may be an intrusion detection system (Intrusion Detection System, IDS). The IDS is used to detect network attacks and generate attack alert text. The terminal 110 includes various forms of desktop computers, laptop computers, PDAs (personal digital assistants), cellular phones, vehicle-mounted terminals, dedicated terminals, and the like. In addition, the device can be a single device or a set of a plurality of devices. For example, a plurality of desktop computers are connected to each other via a lan, share a display, etc. to cooperate with each other, and together form a terminal 110. The terminal 110 may communicate with the internet 130 in a wired or wireless manner, exchanging data.
Server 120 refers to a computer system that provides services to the terminal 110. Compared with the general terminal 110, the server 120 is subject to higher requirements in terms of stability, security, performance, and the like. The server 120 may be a single high-performance computer in a network platform, a cluster of multiple high-performance computers, a portion of one high-performance computer (e.g., a virtual machine), a combination of portions of multiple high-performance computers (e.g., virtual machines), etc.
The network attack detection device 130 of the embodiment of the present disclosure is communicatively connected to the server 120, so that the relevant data of the terminal 110 is acquired through the server 120 to perform the network attack detection method of the present disclosure.
General description of embodiments of the disclosure
The network attack detection method of the embodiment of the present disclosure may be performed in the terminal 110; or may be executed at the server 120; or may be partially executed in the terminal 110 and partially executed in the server 120; or may be performed by the network attack detection device 130; part of the processing may be performed in the terminal 110, part of the processing may be performed in the server 120, and part of the processing may be performed in the network attack detection device 130.
Referring to fig. 2, fig. 2 is a flowchart of a network attack detection method according to an embodiment of the present disclosure. The network attack detection method of the embodiment of the present disclosure includes, but is not limited to, step S210 to step S250;
step S210, a reference attack alarm sequence vector corresponding to a reference attack is obtained, each reference attack alarm associated with the reference attack alarm sequence vector indicates that an attack appearance of the reference attack is detected, and the reference attack comprises a first number of attack stages;
Step S220, based on the reference attack alarm sequence vector, generating a basic attack alarm sequence vector, wherein the basic attack alarm sequence vector comprises partial vector elements extracted from vector elements of reference attack alarms corresponding to a second number of attack phases in the reference attack alarm sequence vector, and the second number is smaller than the first number;
step S230, generating a sample alarm sequence vector based on the basic attack alarm sequence vector, wherein the sample alarm sequence vector comprises a sample attack alarm sequence vector and a sample non-attack alarm sequence vector;
Step S240, inputting the sample alarm sequence vector into a sequential convolution network and a classifier which are sequentially connected to obtain a sample detection attack stage corresponding to the sample alarm sequence vector, and training the sequential convolution network and the classifier based on the sample detection attack stage;
step S250, inputting the target attack alarm sequence vector into the trained time sequence convolution network and classifier to obtain an actual detection attack stage, and detecting the network attack by using the actual detection attack stage.
First, the above-described steps S210 to S250 will be briefly described.
The reference attack in step S210 is a network attack initiated by one known terminal 110 against another known terminal 110. The reference attack includes a first number of successive attack phases; when the reference attack is an APT attack, the first number is 5, and the reference attack includes 5 consecutive attack phases, respectively: a scout stage, a foothold establishment stage, a lateral movement stage, a message stealing or system destruction stage, and a back-door-leaving or attack-trace-clearing stage.
An Attack Surface refers to a collection of all entry points or vulnerabilities associated with an asset that may be exploited by an attacker. These entry points or vulnerabilities may be physical, digital, or human factors. The attack manifestations involve not only the external boundaries of the system or network, but also internal components, communication paths, user interfaces, data and business processes, etc. Thus, an attack surface is a broad concept that encompasses all possible attack paths and potential vulnerabilities.
An IDS system is deployed in the attacked terminal 110, and when the terminal 110 is attacked, the attacked terminal 110 can detect an attack appearance through the IDS system and generate an alert text according to the attack appearance. Since the reference attack is initiated by the known terminal 110, the attack duration of the reference attack can be set and the phase attack duration of each attack phase in the reference attack can be determined; therefore, in the attack duration of the reference attack, the attacked terminal 110 obtains all attack appearances in the duration (and can determine the attack appearance corresponding to each attack stage according to the attack duration of each attack stage) through the IDS system, generates an alarm text for each attack appearance, and ranks the attack appearances according to the generation time of each attack appearance to obtain the reference attack alarm sequence; since the attack appearance corresponding to each attack stage can be determined, in the reference attack alarm sequence, the alarm text corresponding to each attack stage can be determined. Since the reference attack alarm sequence cannot be directly used as an input of the neural network model, the reference attack alarm sequence needs to be converted into a sequence vector, so that the reference attack alarm sequence vector is obtained.
A vector is an array of values in different dimensions, which is a point in multidimensional space. The line segment between the point and the origin in the multi-dimensional coordinate system has a size and direction, which is the size and direction of the vector. Each numerical value is a point value, namely a vector element, projected by the point on a corresponding coordinate axis in a multi-dimensional coordinate system. The vector elements may be numerical values or symbols, but the vector elements of the vector as input to the model are generally numerical values. The specific method of converting the reference attack alarm sequence into the reference attack alarm sequence vector will be described in detail later.
In step S220, a basic attack alert sequence vector is generated based on the reference attack alert sequence vector: a second number of attack phases is determined from the reference attack, and then, in the reference attack alert sequence vector, the vector elements corresponding to each attack phase in the second number of attack phases are determined (since the alert text corresponding to each attack phase can be determined, the vector elements corresponding to each attack phase can be determined in the reference attack alert sequence vector). Partial vector elements are then extracted from the vector elements corresponding to each attack stage in the second number of attack stages to form the basic attack alarm sequence vector.
In one embodiment, the second number is 4, and the second number of attack phases includes a scout phase, a set up foothold phase, a lateral movement phase, a steal information or a destroy system phase. In one embodiment, the second number is 3, and the second number of attack phases includes a scout phase, a set-up foothold phase, and a lateral movement phase. In one embodiment, the second number is 2, and the second number of attack phases includes a scout phase, a set-up foothold phase.
Then generating a sample alarm sequence vector based on the basic attack alarm sequence vector, wherein the sample alarm sequence vector comprises a sample attack alarm sequence vector and a sample non-attack alarm sequence vector; and inputting the sample alarm sequence vector into a sequential convolution network and a classifier which are sequentially connected to obtain a sample detection attack stage corresponding to the sample alarm sequence vector, and training the sequential convolution network and the classifier based on the sample detection attack stage. The specific training process of the time series convolution network and classifier will be described in detail later.
After the training of the time sequence convolution network and the classifier is completed, the target attack alarm sequence vector is input into the trained time sequence convolution network and classifier to obtain an actual detection attack stage, and the actual detection attack stage is utilized to detect the network attack. Specifically, acquiring an alarm text generated according to an attack appearance in an IDS system in the terminal 110 within a period of time to obtain an actual alarm text sequence; the attack appearance may be caused by the unknown terminal 110 launching a network attack, or may be caused by other network activities (such as port detection, unintentionally clicking on an unknown website, password input errors during login, etc.). And then converting the alarm text sequence into a sequence vector so as to obtain a target attack alarm sequence vector, inputting the target attack alarm sequence vector into a trained time sequence convolution network and classifier so as to obtain an actual detection attack stage, and detecting network attack by using the actual detection attack stage.
Steps S210 to S250 are described above. A reference attack alarm sequence vector corresponding to a reference attack is obtained, and a basic attack alarm sequence vector is generated based on it. The reference attack comprises a first number of attack stages; the basic attack alarm sequence vector comprises partial vector elements extracted from the vector elements of the reference attack alarms corresponding to a second number of attack stages in the reference attack alarm sequence vector, where the second number is smaller than the first number. The reference attack alarm sequence vector is thus reconstructed, and the generated basic attack alarm sequence vector better matches how an actual network is attacked. Compared with the related art, in which all alarm sequence vectors of a given attack stage are directly used as training samples, training with the sample alarm sequence vectors generated from the basic attack alarm sequence vector enables the neural network model to capture the dependence among attack stages in a network attack. In addition, the neural network model adopts a time-series convolution network and a classifier that are sequentially connected, so accurate detection of multi-stage attacks can be achieved by extracting global and local temporal information from the sample alarm sequence vector; the sample detection attack stage corresponding to the sample alarm sequence vector is obtained, and the time-series convolution network and the classifier are trained based on it, so that after training they can effectively detect multi-stage continuous network attacks.
The following describes step S210 to step S250 in detail.
Detailed description of step S210
Referring to fig. 3, fig. 3 is a specific flowchart of step S210 in fig. 2. In step S210, a reference attack alert sequence vector corresponding to a reference attack is obtained, including but not limited to steps S310 to S320;
Step S310, a reference attack alarm sequence corresponding to a reference attack is obtained, and each reference attack alarm in the reference attack alarm sequence indicates that an attack appearance of the reference attack is detected;
step S320 converts the reference attack alarm sequence into a reference attack alarm sequence vector.
The reference attack in step S310 is a known network attack, initiated by one known terminal 110 against another known terminal 110. The attacked terminal 110 is deployed with a network attack detection system (for example, an IDS system); within the attack duration of the reference attack, the network attack detection system obtains the reference attack alarm sequence from the warning texts generated according to the attack appearances. For example, if the attack duration of the reference attack is March 1 to October 20, all warning texts generated by the network attack detection system according to the attack appearances between March 1 and October 20 are collected, and the warning texts are sorted according to their generation time, so that the reference attack alarm sequence is obtained. In an embodiment, for convenience of data collection, the duration of the reference attack may be set to March 1 to March 3, so that all warning texts generated by the network attack detection system according to the attack appearances between March 1 and March 3 are collected, and the warning texts are sorted according to their generation time to obtain the reference attack alarm sequence.
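As a small hedged sketch of this collection step (the field names, timestamps and alert texts below are hypothetical), the warning texts falling inside the reference attack's duration are filtered and sorted by generation time:

```python
from datetime import datetime

# Hypothetical IDS alert records: each has a generation time and an alert text.
alerts = [
    {"time": datetime(2024, 3, 1, 10, 5), "text": "Port activity anomaly detected"},
    {"time": datetime(2024, 3, 2, 9, 30), "text": "Unauthorized access attempt detected"},
    {"time": datetime(2024, 3, 1, 18, 0), "text": "Malicious file upload detected"},
]
start, end = datetime(2024, 3, 1), datetime(2024, 3, 3, 23, 59)

# Keep alerts inside the reference attack's duration and order them by generation time.
reference_sequence = [a["text"] for a in sorted(alerts, key=lambda a: a["time"])
                      if start <= a["time"] <= end]
print(reference_sequence)
```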
It should be noted that the network attack detection system deployed in the terminal 110 may be, besides an IDS system, an intrusion prevention system (Intrusion Prevention System, IPS) or a system information and event management system (System Information and Event Management, SIEM). The IPS is a network security device or software that monitors network traffic in real time and detects and prevents potential malicious attacks; when the IPS detects a potential security event, it generates early-warning and alarm information, i.e., the IPS can also generate alarm text according to the attack appearance. The SIEM system aggregates log data, security alarms and events into a centralized platform, provides real-time analysis for security monitoring, and generates alarms when it detects a potential security threat or policy-violation event, i.e., the SIEM system can also generate alarm text according to the attack appearance.
In step S320, since the reference attack alarm sequence cannot be directly input to the neural network model, the reference attack alarm sequence needs to be converted into a sequence vector, thereby obtaining the reference attack alarm sequence vector.
In one embodiment, each element of the reference attack alert sequence is encoded with one-hot (One-Hot) encoding to obtain the reference attack alert sequence vector. One-Hot encoding is often used to convert categorical variables into binary vector representations. This first requires mapping the categorical values to integer values; each integer value is then represented as a binary vector whose elements are all zero except at the index position of that integer, where the value is 1. For example, if the alert texts of the elements in the reference attack alert sequence have T types in total and the alert text of the nth element is of type t, then the One-Hot code of the nth element of the reference attack alert sequence is [v1=0, v2=0, ..., vt=1, ..., vT=0]. In an embodiment, referring to fig. 4, fig. 4 is a schematic diagram of converting a reference attack alert sequence into a reference attack alert sequence vector according to an embodiment of the present disclosure. In the reference attack alarm sequence, alarm texts that are identical or semantically similar are treated as the same type. In this embodiment the alarm texts comprise three types in total, namely "port activity anomaly detected, possibly potential attack behavior", "unauthorized access attempt detected", and "malicious file upload behavior detected". With One-Hot coding, an element whose alarm text is "port activity anomaly detected, possibly potential attack behavior" is encoded as [v1=1, v2=0, v3=0]; an element whose alert text is "unauthorized access attempt detected" is encoded as [v1=0, v2=1, v3=0]; and an element whose alert text is "malicious file upload behavior detected" is encoded as [v1=0, v2=0, v3=1].
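The One-Hot conversion above can be sketched in a few lines of Python (the alert strings mirror the example in the text; the exact vocabulary handling in the patent may differ):

```python
import numpy as np

# Three alert-text types map to 3-dimensional one-hot vectors.
alert_types = ["Port activity anomaly detected, possibly potential attack behavior",
               "Unauthorized access attempt detected",
               "Malicious file upload behavior detected"]
index = {t: i for i, t in enumerate(alert_types)}

def one_hot(text):
    v = np.zeros(len(alert_types))
    v[index[text]] = 1.0          # 1 at the index of this alert type, 0 elsewhere
    return v

sequence = ["Unauthorized access attempt detected",
            "Malicious file upload behavior detected"]
reference_vector = np.stack([one_hot(t) for t in sequence])
print(reference_vector)           # [[0. 1. 0.], [0. 0. 1.]]
```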
In one embodiment, each element of the reference attack alert sequence is encoded using dummy encoding (Dummy encoding) to obtain the reference attack alert sequence vector. Dummy coding is a coding scheme commonly used in data processing and machine learning, mainly for handling categorical variables, especially unordered ones. In dummy coding, k-1 new binary variables (dummy variables) are generated for a categorical variable with k different values. Each new binary variable represents a specific value of the original categorical variable; when the value of the original variable is the value represented by the binary variable, the binary variable is 1, otherwise it is 0. In this way, the information of the original categorical variable is converted into a series of binary variables that can be processed by many machine learning algorithms. In this embodiment, alarm texts in the reference attack alert sequence that are identical or semantically similar are treated as the same type, and the alert texts include three types in total, namely "port activity anomaly detected, possibly potential attack behavior", "unauthorized access attempt detected", and "malicious file upload behavior detected". Taking "port activity anomaly detected, possibly potential attack behavior" as the reference class, no dummy variable needs to be created for it. A dummy variable v1 is created for "unauthorized access attempt detected": v1 is 1 when the alert text is "unauthorized access attempt detected", otherwise v1 is 0. A dummy variable v2 is created for "malicious file upload behavior detected": v2 is 1 when the alarm text is "malicious file upload behavior detected", otherwise v2 is 0. Thus, an element whose alert text is "port activity anomaly detected, possibly potential attack behavior" is converted into the dummy code [v1=0, v2=0]; an element whose alert text is "unauthorized access attempt detected" is converted into [v1=1, v2=0]; and an element whose alert text is "malicious file upload behavior detected" is converted into [v1=0, v2=1].
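A corresponding sketch of the dummy-coding option, again assuming the three alert types above with the first one as the reference class:

```python
import numpy as np

reference_class = "Port activity anomaly detected, possibly potential attack behavior"
other_classes = ["Unauthorized access attempt detected",
                 "Malicious file upload behavior detected"]

def dummy_code(text):
    # k - 1 = 2 indicator variables; the reference class gets the all-zero code.
    return np.array([1.0 if text == c else 0.0 for c in other_classes])

print(dummy_code(reference_class))                            # [0. 0.]
print(dummy_code("Unauthorized access attempt detected"))     # [1. 0.]
print(dummy_code("Malicious file upload behavior detected"))  # [0. 1.]
```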
Detailed description of step S220
Referring to fig. 5, fig. 5 is a specific flowchart of step S220 in fig. 2. In step S220, a base attack alert sequence vector is generated based on the reference attack alert sequence vector, including but not limited to steps S510 to S520;
step S510, selecting a continuous second number of attack phases from the first number of attack phases;
Step S520, based on the first ratio, extracting a part of vector elements from vector elements of the reference attack alarm of each attack stage of the second number of attack stages in the reference attack alarm sequence vector, to form a basic attack alarm sequence vector.
In one embodiment, referring to FIG. 6, FIG. 6 is a schematic diagram of the present disclosure making up a base attack alert sequence vector. The reference attack is an APT attack, the first number is 5, and the reference attack alarm sequence vector includes 5 consecutive attack phases, which are respectively: a scout stage, a foothold establishment stage, a lateral movement stage, a message stealing or system destruction stage, a back door leaving stage or an attack trace clearing stage. The reference attack alarm sequence vector comprises 500 vector elements, namely a1, a2, a500, and vector elements corresponding to a reconnaissance stage are a1., a50; the vector element corresponding to the stage of establishing the foothold is a 51..a 200; the vector element corresponding to the lateral movement phase is a 201..a 286; the vector element corresponding to the steal information or destroy system phase is a 287..a 350; the vector element corresponding to the back gate or clear attack trace stage is left as a 351..a500. The second number is taken as 2 in this embodiment, and the second number of attack phases includes a scout phase and a foothold establishment phase. In this embodiment, the first ratio is 0.5, and the vector elements ranked in the first 50 percent are extracted from the vector elements corresponding to the reconnaissance stage, and the vector elements ranked in the first 50 percent are extracted from the vector elements corresponding to the foothold establishment stage, so as to form a basic attack alarm sequence vector, where the basic attack alarm sequence vector is [ s1, s2], where s1 is [ a1, a 2..a25 ], and s2 is [ a51, a52,..a 125].
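A small Python sketch of this extraction (the phase boundaries and the 0.5 ratio follow the worked example above; the helper name is hypothetical):

```python
def build_base_vector(reference_vector, phase_slices, first_ratio=0.5):
    """reference_vector: list of vector elements a1..aN;
    phase_slices: (start, end) index pairs for the selected consecutive attack phases."""
    base = []
    for start, end in phase_slices:
        phase_elems = reference_vector[start:end]
        keep = int(len(phase_elems) * first_ratio)     # keep the first 50% of each phase
        base.append(phase_elems[:keep])
    return base

reference_vector = [f"a{i}" for i in range(1, 501)]
# reconnaissance phase a1..a50 -> indices 0..50; foothold phase a51..a200 -> indices 50..200
base = build_base_vector(reference_vector, [(0, 50), (50, 200)], first_ratio=0.5)
print(base[0][0], base[0][-1], base[1][0], base[1][-1])   # a1 a25 a51 a125
```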
Referring to fig. 7, fig. 7 is a flow chart of a second number determination of an embodiment of the present disclosure. The second number is determined by:
Step S710, obtaining an attack set to be detected, wherein the attack set to be detected comprises a plurality of attacks to be detected;
step S720, determining the number of attack stages included in each attack to be detected;
In step S730, a second number is determined based on the average of the attack stage numbers.
In one embodiment, the process of steps S210 to S240 is performed multiple times in order to train the time-series convolution network and the classifier. To be able to execute steps S210 to S240 multiple times, multiple reference attacks need to be constructed in advance, that is, a set of attacks to be detected is constructed. The set includes multiple attacks to be detected, each attack to be detected serves as a reference attack, and the number of attack stages included in different attacks to be detected may be the same or different. The average of the numbers of attack stages of all attacks to be detected is taken as the second number. For example, if the attack set to be detected includes 10 attacks to be detected whose numbers of attack stages are 5,5,5,3,4,2,4,3,5,4 respectively, the average of [5,5,5,3,4,2,4,3,5,4] is (5+5+5+3+4+2+4+3+5+4)/10=4, so the second number is determined as 4. If the average is not an integer, only its integer part is taken as the second number. For example, if the attack set to be detected includes 10 attacks to be detected whose numbers of attack stages are 5,5,5,3,4,2,2,3,5,4 respectively, the average of [5,5,5,3,4,2,2,3,5,4] is (5+5+5+3+4+2+2+3+5+4)/10=3.8, and the second number is determined to be 3.
In an embodiment, the median of the number of attack phases of all attacks to be detected is taken as the second number. For example, when the attack set to be detected includes 10 attacks to be detected, the number of attack stages included in the 10 attacks to be detected is 5,5,5,3,4,2,2,3,5,4 respectively; then the median of [5,5,5,3,4,2,2,3,5,4] is 4 and thus the second number is determined to be 4.
In one embodiment, the weighted average of the numbers of attack stages of all attacks to be detected is taken as the second number; when the weighted average is not an integer, its integer part is taken as the second number. For example, the attack set to be detected includes 10 attacks to be detected whose numbers of attack stages are 5,5,5,3,4,2,2,3,5,4, and the weights of [5,5,5,3,4,2,2,3,5,4] are [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1]; the calculated weighted average is 3.58, so the second number is 3.
Determining the second number through steps S710-S730 has the advantage that the generated basic attack alert sequence vector can be made more realistic.
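As a hedged illustration of the three strategies above (average with the integer part taken, median, and weighted average), the following Python sketch uses the stage counts and weights from the examples in this section; the variable names are assumptions for illustration only and are not part of the disclosure.

```python
# Sketch of the three second-number strategies described above.
from statistics import median

stage_counts = [5, 5, 5, 3, 4, 2, 2, 3, 5, 4]
weights = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

# Average, keeping only the integer part when it is not an integer.
second_by_average = int(sum(stage_counts) / len(stage_counts))          # 3 (from 3.8)

# Median of the stage counts.
second_by_median = int(median(stage_counts))                            # 4

# Weighted average, again keeping only the integer part.
weighted_avg = sum(c * w for c, w in zip(stage_counts, weights)) / sum(weights)
second_by_weighted = int(weighted_avg)                                  # 3 (from ~3.58)

print(second_by_average, second_by_median, second_by_weighted)
```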
Referring to fig. 8, fig. 8 is a flow chart illustrating a first ratio determination according to an embodiment of the present disclosure.
The first ratio is determined by:
Step S810, for each of the second number of attack phases, obtaining a third number of reference attack alarms in the attack phase;
Step S820, for each attack stage in the second number of attack stages, obtaining a fourth number of reference attack alarms in the attack stage that are different from the previous reference attack alarm;
Step S830, a first ratio is determined based on the third number and the fourth number.
In step S810, since the reference attack is a known network attack and may be preset, the stage attack duration of each attack stage of the reference attack can be determined, so that the stage attack duration of each of the second number of attack stages can be determined. The number of vector elements in the reference attack alarm sequence vector corresponding to the alarm text generated by the network attack detection system within the stage attack duration of one attack stage is determined and taken as the third number of reference attack alarms of that attack stage. For example, the reference attack includes 3 attack stages, the attack duration of the reference attack is March 1 to March 10, the 3 attack stages are a first attack stage, a second attack stage and a third attack stage, the attack duration of the first attack stage is March 1 to March 4, the attack duration of the second attack stage is March 5 to March 7, and the attack duration of the third attack stage is March 8 to March 10. If the second number of attack stages includes the first attack stage and the second attack stage described above, then the number of vector elements in the reference attack alarm sequence vector corresponding to the alarm text generated by the network attack detection system from March 1 to March 4 is determined and taken as the third number of reference attack alarms of the first attack stage; and the number of vector elements in the reference attack alarm sequence vector corresponding to the alarm text generated by the network attack detection system from March 5 to March 7 is determined and taken as the third number of reference attack alarms of the second attack stage.
In an embodiment, after the third number is determined, a fourth number of reference attack alarms in the attack stage that are different from the previous reference attack alarm (i.e., vector element) is obtained for each of the second number of attack stages. For example, the third number of the first attack stage is 10, and the vector elements in the reference attack alarm sequence corresponding to the first attack stage are [a1, a2, a3, a4, a5, a6, a7, a8, a9, a10], where a1, a2 and a3 are the same, a4 and a5 are the same, and a7, a8, a9 and a10 are the same, while a3 and a4 are different, a5 and a6 are different, and a6 and a7 are different. It can therefore be determined that a4, a6 and a7 are the reference attack alarms that differ from the previous reference attack alarm, and the fourth number of the first attack stage is 3. In this embodiment, the fourth number divided by the third number is taken as the first ratio, and the first ratio of the first attack stage is 3/10 = 0.3. Also in this embodiment, the reference attack alarms in the attack stage that differ from the previous reference attack alarm are taken as the partial vector elements extracted from the attack stage, and these partial vector elements are used to compose the basic attack alarm sequence vector. Specifically, for the first attack stage, a4, a6 and a7 are taken as the partial vector elements extracted from the first attack stage, and these partial vector elements are used to compose the basic attack alarm sequence vector.
In one embodiment, (third number - fourth number)/third number is taken as the first ratio. For example, if the third number of the second attack stage is 100 and the fourth number is 40, the first ratio of the second attack stage is (100 - 40)/100 = 0.6. In this embodiment, 60 percent of the reference attack alarms are randomly extracted from all the reference attack alarms (i.e., vector elements) of the second attack stage as the partial vector elements extracted from the second attack stage, and these partial vector elements are used to compose the basic attack alarm sequence vector.
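A minimal sketch, assuming the alarms of one attack stage are represented as a simple list, of how the third number, the fourth number (alarms that differ from the immediately preceding alarm) and the resulting first ratio could be computed; it reproduces the a1..a10 example above, where a4, a6 and a7 are the changed alarms.

```python
# Illustrative sketch of computing the third number, fourth number and first ratio
# for one attack stage; the letters stand in for the alarm vector elements a1..a10.
phase_alarms = ["a", "a", "a", "b", "b", "c", "d", "d", "d", "d"]

third_number = len(phase_alarms)                                   # 10
changed = [i for i in range(1, third_number)
           if phase_alarms[i] != phase_alarms[i - 1]]              # positions of a4, a6, a7
fourth_number = len(changed)                                       # 3

first_ratio = fourth_number / third_number                         # 0.3
extracted_elements = [phase_alarms[i] for i in changed]            # partial vector elements
print(first_ratio, extracted_elements)
```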
Detailed description of step S230
Referring to fig. 9, fig. 9 is a schematic diagram showing a specific flow of step S230 in fig. 2; in step S230, a sample alert sequence vector is generated based on the base attack alert sequence vector, which may include, but is not limited to, step S910 and step S920.
Step S910, inserting a first noise vector element between vector elements of reference attack alarms in every two adjacent attack phases in the basic attack alarm sequence vector, and inserting a second noise vector element before and after the basic attack alarm sequence vector to obtain a sample attack alarm sequence vector;
and step S920, replacing vector elements of the reference attack alarm in each attack stage in the sample attack alarm sequence vector with random vector elements to obtain a sample non-attack alarm sequence vector.
The first noise vector element is a vector element that is not associated with the reference attack, and the second noise vector element is also a vector element that is not associated with the reference attack. The first noise vector element and the second noise vector element may be the same or different. A first noise vector element is inserted between the vector elements of the reference attack alarms of every two adjacent attack stages in the basic attack alarm sequence vector, and a second noise vector element is inserted before and after the basic attack alarm sequence vector, so that the sample attack alarm sequence vector better matches the situation in an actual network. Specifically, for example, if the basic attack alarm sequence vector includes three attack stages, and s1, s2 and s3 respectively represent the vector elements of the reference attack alarm sequence vector included in each of the three attack stages, then the basic attack alarm sequence vector is [s1, s2, s3]. A first noise vector element f1 and a second noise vector element f2 are generated, and the resulting sample attack alarm sequence vector is [f2, s1, f1, s2, f1, s3, f2], where the two f1 may be the same or different, and the two f2 may be the same or different.
In one embodiment, an attack appearance unrelated to the reference attack is detected in the attacked terminal 110 by the network attack detection system, which generates an alarm text based on the attack appearance and then converts the alarm text into a vector; this vector may serve as the first noise vector element or the second noise vector element. The attack appearance is independent of the reference attack and may be caused by other network activities, such as port probing, inadvertently clicking on an unknown website, or a password entry error at login.
In an embodiment, since the reference attack is a known cyber attack, the attack duration of the reference attack can be determined, and alert text generated outside the attack duration by the cyber attack detection system in the attacked terminal 110 is collected, and the alert text is converted into a vector, which may be the first noise vector element or the second noise vector element. For example, if the attack duration of the reference attack is 3 months 1 day to 3 months 20 days, an alarm text generated by the cyber attack detection system in the attacked terminal 110 on 2 months 10 day to 2 months 20 day may be collected, and the alarm text may be converted into a vector as a first noise vector element; the alert text generated by the cyber attack detection system in the attacked terminal 110 on the 3 month 11 day to the 3 month 14 day may be collected, and converted into a vector as the second noise vector element.
In one embodiment, the first noise vector element and the second noise vector element are Gaussian noise vectors. When a Gaussian noise vector is generated, the mean value and the standard deviation of the Gaussian noise are set first, and then a random number table or an online Gaussian noise generator is used to generate a plurality of values based on the set mean value and standard deviation, and these values form the Gaussian noise vector. For example, the mean value and standard deviation of the Gaussian noise are set to 0 and 1 respectively, and an online Gaussian noise generator is used to generate 10 values, namely 0.2, -0.5, 1.3, -0.1, 0.8, -0.3, 0.7, 0.1, -0.9 and 0.4; the resulting Gaussian noise vector is [0.2, -0.5, 1.3, -0.1, 0.8, -0.3, 0.7, 0.1, -0.9, 0.4].
Referring to fig. 10, fig. 10 is a schematic diagram of deriving a sample attack alarm sequence vector and a sample non-attack alarm sequence vector according to an embodiment of the present disclosure. In fig. 10, the basic attack alarm sequence vector includes two attack stages, namely a reconnaissance stage and a foothold establishment stage. The reconnaissance stage includes the vector elements [a1, a2, ..., a25], and the foothold establishment stage includes the vector elements [a51, a52, ..., a125]. A first noise vector [c1, c2, ..., cm] is inserted between the reconnaissance stage and the foothold establishment stage, a second noise vector [b1, b2, ..., bn] is inserted before the reconnaissance stage, and a second noise vector [d1, d2, ..., dk] is inserted after the foothold establishment stage, thereby generating the sample attack alarm sequence vector. Then, based on the sample attack alarm sequence vector, random vector elements [e1, e2, ..., ej] are used to replace the reconnaissance stage [a1, a2, ..., a25], and random vector elements [f1, f2, ..., fh] are used to replace the foothold establishment stage [a51, a52, ..., a125], so as to obtain the sample non-attack alarm sequence vector.
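The construction in fig. 10 can be sketched as follows. NumPy, the embedding dimension and the noise-block lengths are illustrative assumptions, and the Gaussian noise uses the mean 0 and standard deviation 1 mentioned above: noise blocks are inserted before, between and after the two attack stages to obtain a sample attack alarm sequence vector, and the attack-stage elements are then replaced with random vector elements to obtain a sample non-attack alarm sequence vector.

```python
# Illustrative sketch of the FIG. 10 construction (dimensions are assumptions).
import numpy as np

rng = np.random.default_rng(0)
dim = 8                                   # assumed embedding dimension per alarm

s1 = rng.normal(size=(25, dim))           # reconnaissance stage elements (a1..a25)
s2 = rng.normal(size=(75, dim))           # foothold establishment stage elements (a51..a125)

def noise(length):
    # Gaussian noise vectors with mean 0 and standard deviation 1, as described above.
    return rng.normal(loc=0.0, scale=1.0, size=(length, dim))

b, c, d = noise(10), noise(10), noise(10)                 # second, first, second noise blocks
sample_attack = np.concatenate([b, s1, c, s2, d])         # [b.., a1..a25, c.., a51..a125, d..]

# Replace each attack stage with random vector elements of the same shape.
sample_non_attack = sample_attack.copy()
sample_non_attack[10:35] = rng.normal(size=(25, dim))     # replaces s1
sample_non_attack[45:120] = rng.normal(size=(75, dim))    # replaces s2
print(sample_attack.shape, sample_non_attack.shape)        # (130, 8) (130, 8)
```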
Detailed description of step S240
Referring to fig. 11, fig. 11 is a specific flowchart of step S240 in fig. 2. In step S240, the sample alert sequence vector is input to a sequential convolution network and a classifier that are sequentially connected to obtain a sample detection attack stage corresponding to the sample alert sequence vector, including but not limited to step S1110 and step S1120.
Step S1110, inputting a sample alarm sequence vector into a time sequence convolution network to obtain a first stage feature vector corresponding to each attack stage;
step S1120, inputting the obtained first stage feature vector corresponding to each attack stage into a classifier to obtain a sample detection attack stage corresponding to the sample alarm sequence vector.
The sample alarm sequence vector is input into the time sequence convolution network, which can capture both the global information and the local information of the sample alarm sequence vector so as to achieve accurate detection of multiple attack stages, thereby obtaining the first stage feature vector corresponding to each attack stage. The obtained first stage feature vectors corresponding to each attack stage are then input into the classifier to obtain the sample detection attack stage corresponding to the sample alarm sequence vector.
In an embodiment, a fully connected layer is used as the classifier, and before being input into the classifier, the obtained first stage feature vectors corresponding to each attack stage are subjected to global average pooling. Global average pooling is a special pooling operation that is commonly used in convolutional neural networks to extract features. Its main characteristic is that the entire feature map is averaged directly, instead of setting a fixed pooling window size; that is, global average pooling averages the whole feature map of each channel, thereby generating one aggregate feature value per channel. Applying global average pooling to the first stage feature vectors effectively compresses and represents their information while avoiding the overfitting that a fully connected layer may cause, and because each first stage feature vector is directly related to the prediction probability of one category through global average pooling, the interpretability of the model is improved.
In one embodiment, a Softmax function is used as the classifier. The Softmax function may convert the raw output to a probability distribution such that the output value for each class is between 0 and 1, and the sum of the probabilities for all classes is 1. The Softmax function is numerically more stable than other normalization methods, such as simple maximum normalization. When the input value is larger or smaller, the Softmax function can avoid the problem of numerical overflow or underflow, thereby ensuring the stability and accuracy of the model. The Softmax function is relatively simple to calculate and can be efficiently calculated in a vectorized manner. This enables a reduction in computation time and resource consumption when training large-scale datasets.
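A hedged sketch of the classifier head described above: global average pooling collapses each channel of the first stage feature map to one value, a fully connected layer produces class logits, and a numerically stable Softmax turns them into probabilities that sum to 1. The shapes and random weights below are assumptions for illustration only.

```python
# Illustrative sketch: global average pooling + fully connected layer + Softmax.
import numpy as np

rng = np.random.default_rng(1)
channels, time_steps, num_classes = 16, 130, 5

feature_map = rng.normal(size=(channels, time_steps))     # first-stage feature vectors
pooled = feature_map.mean(axis=1)                         # global average pooling -> (channels,)

W = rng.normal(size=(num_classes, channels)) * 0.1        # fully connected layer
b = np.zeros(num_classes)
logits = W @ pooled + b

# Softmax with the maximum subtracted for numerical stability.
exp = np.exp(logits - logits.max())
probabilities = exp / exp.sum()
print(probabilities, probabilities.sum())                  # probabilities sum to 1
```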
The time sequence convolution network includes a first extended causal convolution layer, a channel attention model, a spatial attention model, a multiplier, a second extended causal convolution layer, and a residual sum layer. The time sequence convolution network further comprises a first weight normalization layer, a first correction linear unit and a first random discarding layer. Referring to fig. 12, fig. 12 is a schematic diagram of the structure of the time sequence convolution network and classifier according to an embodiment of the present disclosure. The time sequence convolution network comprises a plurality of residual blocks which are sequentially connected, and each residual block comprises a first extended causal convolution layer, a channel attention model, a spatial attention model, a multiplier, a second extended causal convolution layer, a residual sum layer, a first weight normalization layer, a first correction linear unit and a first random discarding layer. The expansion coefficients of the first and second extended causal convolution layers in a residual block are equal. The expansion coefficients of the sequentially connected residual blocks increase exponentially; for example, in fig. 12, 4 residual blocks are sequentially connected, and their expansion coefficients are 1, 2, 4 and 8 respectively. In fig. 12, (a) illustrates the simplified structure of the time sequence convolution network and classifier, (b) illustrates the specific structure of a residual block, in which the global attention is composed of the channel attention model and the spatial attention model, and (c) illustrates the specific structure of an extended causal convolution layer, where d is the expansion coefficient; this structure can be either the first extended causal convolution layer or the second extended causal convolution layer, which have the same structure.
In a time sequence convolution network, the primary effect of the expansion coefficient is to increase the receptive field of the convolution layer, i.e., the time range of the input sequence that the network is able to capture. By applying different expansion coefficients in the convolution operations of different layers, the TCN can effectively expand the field of view of the network, enabling it to process longer sequence data. The expansion coefficient determines the spacing between adjacent elements in the convolution kernel. When the expansion coefficient is 1, the convolution is a conventional convolution; when the expansion coefficient is greater than 1, the convolution kernel skips some elements in the input sequence during the convolution operation. This design allows higher layers of the network to capture information over longer time spans, helping to handle long-term dependencies in the sequence data. The exponential increase of the expansion coefficient in the embodiments of the present disclosure ensures that the receptive field of the network (i.e., the length of the input sequence that the network can cover) expands rapidly as the number of layers increases. This is important for processing long sequence data, as it ensures that the network can capture dependencies over a longer period of time; the time sequence convolution network of the present disclosure can therefore capture the dependencies between different attack stages in a reference attack.
In the extended causal convolution layer, the sequence is padded according to a padding coefficient, which ensures that the output sequence has the same length as the input sequence and expands the receptive field so that the convolution kernel can capture a wider range of context information. For example, a padding coefficient of 2 means that 2 padding values are added on each side (typically left and right) of the input sequence. These padding values are typically set to 0 and are therefore also referred to as zero-padding. If the length of the input sequence is L, the length of the sequence becomes L + 2 × 2 = L + 4 after the padding coefficient of 2 is applied. In causal convolution, since the output of each time step can only depend on current and past information, some of the input information may be ignored during the convolution process if padding is not used. Adding padding ensures that all information in the input sequence is taken into account by the convolution kernel, thereby avoiding information loss. Therefore, in the extended causal convolution layer of the present disclosure, the sequence is padded according to the padding coefficient, so that information loss of the sample alarm sequence vector can be avoided.
Referring to fig. 13, fig. 13 is a specific flowchart of step S1110 in fig. 11. In step S1110, the sample alert sequence vector is input into the time sequence convolutional network to obtain a first stage feature vector corresponding to each attack stage, including:
Step S1310, inputting the sample alarm sequence vector into a first extended causal convolution layer to obtain a first sample extended convolution vector;
Step S1320, inputting the first sample expansion convolution vector into a channel attention model to obtain a sample channel attention feature vector;
Step S1330, input the sample channel attention feature vector into the space attention model to obtain the sample space attention feature vector;
Step S1340, multiplying the first sample spread convolution vector and the sample space attention feature vector by a multiplier to obtain a first sample multiplication vector;
Step S1350, inputting the first sample multiplication vector into the second spreading causal convolution layer to obtain a second sample spreading convolution vector;
Step S1360, performing residual error sum processing on the sample alarm sequence vector and the second sample spread convolution vector through the residual error sum layer to obtain a first stage feature vector corresponding to each attack stage.
In step S1310, the sample alarm sequence vector is subjected to an extended causal convolution in the first extended causal convolution layer to obtain a first sample extended convolution vector. The extended causal convolution combines causal convolution and extended convolution. For one-dimensional data [x0, x1, ...], the extended convolution at time t has the following calculation formula:

$F(t) = \sum_{i=0}^{k-1} f(i) \cdot x_{t - m \cdot i}$

where F(t) represents the output of the extended causal convolution at time t; f represents a one-dimensional convolution kernel; m represents the expansion coefficient; k represents the size of the convolution kernel; f(i) represents the element of the convolution kernel at position i; and $x_{t - m \cdot i}$ represents the element of the corresponding sequence after interval sampling.
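A minimal NumPy sketch of the formula above, assuming left-only zero padding of (k - 1) × m so that the output sequence keeps the input length; this illustrates the dilated causal convolution operation, not the patent's exact layer implementation.

```python
# Illustrative dilated causal convolution: F(t) = sum_{i=0}^{k-1} f(i) * x_{t - m*i}.
import numpy as np

def dilated_causal_conv(x, kernel, dilation):
    k = len(kernel)
    pad = (k - 1) * dilation                      # zero padding on the causal (left) side
    padded = np.concatenate([np.zeros(pad), x])
    out = np.empty_like(x, dtype=float)
    for t in range(len(x)):
        # Each output depends only on the current and past (interval-sampled) inputs.
        out[t] = sum(kernel[i] * padded[pad + t - dilation * i] for i in range(k))
    return out

x = np.arange(1.0, 11.0)                          # x0 .. x9
f = np.array([0.5, 0.3, 0.2])                     # one-dimensional convolution kernel, k = 3
print(dilated_causal_conv(x, f, dilation=2))      # same length as the input sequence
```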
Referring to fig. 14, fig. 14 is a specific flowchart of step S1320 in fig. 13. Step S1320, inputting the first sample spread convolution vector into the channel attention model to obtain a sample channel attention feature vector, including but not limited to the following steps:
step S1410, performing weight normalization operation on the first sample spread convolution vector through a first weight normalization layer to obtain a first weight normalized feature vector;
step S1420, performing nonlinear activation on the first weight normalized feature vector through a first correction linear unit to obtain a first activated feature vector;
Step S1430, carrying out random discarding on vector elements in the first activated feature vector through a first random discarding layer to obtain a first discarded feature vector;
In step S1440, the first discarded feature vector is input into the channel attention model to obtain the sample channel attention feature vector.
Weight normalization enables each node to contribute equally to the model, thereby improving the training efficiency and accuracy of the neural network, because normalization reduces the numerical differences between different features and thus speeds up the convergence of the gradient descent algorithm. In the time sequence convolution network, the input sample alarm sequence vector is time sequence data, so the first sample expansion convolution vector is also time sequence data, and the feature values at different time steps may differ considerably. Through weight normalization, these feature values are placed on a more equal footing in the model, thereby improving the performance of the time sequence convolution network.
The nonlinear activation function can introduce nonlinear factors, so that the neural network has the ability of learning and fitting complex functions. Without a nonlinear activation function, the neural network can only compute a linear function, no matter how many layers it has, which greatly limits the expressive power of the neural network. The expression capability of the time sequence convolution network can be improved by carrying out nonlinear activation on the first weight normalization feature vector through the first correction linear unit.
Step S1440 may include, but is not limited to, the following steps:
determining a channel attention coefficient based on the first sample spread convolution vector;
A sample channel attention feature vector is determined based on the first sample spread convolution vector and the channel attention coefficient.
Specifically, steps S1410 to S1430 are performed on the first sample expansion convolution vector to obtain the first post-discard feature vector, channel attention processing is performed on the first post-discard feature vector through the channel attention model to obtain a channel attention coefficient, and the first post-discard feature vector is then multiplied by the channel attention coefficient to obtain the sample channel attention feature vector. The processing procedure is expressed as:

$F_2 = M_c(F_1) \otimes F_1$

where $M_c(F_1)$ represents the channel attention coefficient obtained by performing channel attention processing on the first post-discard feature vector; $F_1$ denotes the first post-discard feature vector; and $F_2$ denotes the sample channel attention feature vector.
Step S1330 may include, but is not limited to, the following steps:
determining a spatial attention coefficient based on the sample channel attention feature vector;
a sample spatial attention feature vector is determined based on the sample channel attention feature vector and the spatial attention coefficients.
Specifically, spatial attention processing is performed on the sample channel attention feature vector through the spatial attention model to obtain a spatial attention coefficient, and the spatial attention coefficient is then multiplied by the sample channel attention feature vector to obtain the sample spatial attention feature vector. The process can be expressed as:

$F_3 = M_s(F_2) \otimes F_2$

where $M_s(F_2)$ represents the spatial attention coefficient obtained by performing spatial attention processing on the sample channel attention feature vector; $F_2$ denotes the sample channel attention feature vector; and $F_3$ denotes the sample spatial attention feature vector.
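A hedged sketch of the two multiplicative attention steps above, $F_2 = M_c(F_1) \otimes F_1$ and $F_3 = M_s(F_2) \otimes F_2$. The pooling-plus-sigmoid way of producing the coefficients below is a common CBAM-style choice assumed only for illustration; the disclosure itself fixes only the multiplicative form.

```python
# Illustrative channel attention followed by spatial attention (shapes are assumptions).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
C, T = 16, 130
F1 = rng.normal(size=(C, T))                       # first post-discard feature vector

# Channel attention: one coefficient per channel, broadcast over time.
Mc = sigmoid(F1.mean(axis=1) + F1.max(axis=1))     # shape (C,)
F2 = Mc[:, None] * F1                              # sample channel attention feature vector

# Spatial attention: one coefficient per time step, broadcast over channels.
Ms = sigmoid(F2.mean(axis=0) + F2.max(axis=0))     # shape (T,)
F3 = Ms[None, :] * F2                              # sample spatial attention feature vector

print(F2.shape, F3.shape)                          # both (16, 130)
```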
After obtaining the sample space attention feature vector, multiplying the first sample spread convolution vector and the sample space attention feature vector by a multiplier to obtain a first sample multiplication vector; inputting the first sample multiplication vector into a second extended causal convolution layer to obtain a second sample extended convolution vector; and carrying out residual error sum processing on the sample alarm sequence vector and the second sample expansion convolution vector through the residual error sum layer to obtain a first-stage characteristic vector corresponding to each attack stage.
The time sequence convolution network further comprises a second weight normalization layer, a second correction linear unit and a second random discarding layer; specifically, referring to fig. 12, the residual block further includes a second weight normalization layer, a second correction linear unit, and a second random discard layer.
Referring to fig. 15, fig. 15 is a specific flowchart of step S1360 in fig. 13. Step S1360 performs residual error sum processing on the sample alert sequence vector and the second sample spread convolution vector through residual error sum layers to obtain a first stage feature vector corresponding to each attack stage, including but not limited to the following steps:
Step S1510, performing renormalization operation on the second sample spread convolution vector through a second weight normalization layer to obtain a second weight normalized feature vector;
step S1520, performing nonlinear activation on the second weight normalized feature vector by a second correction linear unit to obtain a second activated feature vector;
Step S1530, the vector elements in the second activated feature vector are randomly discarded through the second random discarding layer, so as to obtain a second discarded feature vector;
In step S1540, residual sum processing is performed on the sample alarm sequence vector and the second post-discard feature vector through the residual sum layer to obtain the first stage feature vector corresponding to each attack stage.
The second weight normalization layer may be the same as the first weight normalization layer; the second modified linear unit may be identical to the first modified linear unit. The second random discard layer may be the same as the first random discard layer.
In a time sequence convolution network, as the number of layers increases, gradients may vanish or explode during back propagation, making the network difficult to train. A residual connection creates a direct path from input to output, making it easier for gradients to pass through the whole network during propagation, thereby effectively alleviating the problems of gradient vanishing and gradient explosion. Moreover, because the residual connection allows gradients to be passed back more directly to earlier layers, the model can converge to the optimal solution faster during training, which greatly improves training efficiency and reduces training time. Therefore, in the embodiments of the present disclosure, the sample alarm sequence vector is subjected to a 1×1 convolution in the residual sum layer, and the 1×1-convolved sample alarm sequence vector and the second post-discard feature vector are then subjected to residual sum processing to obtain the first stage feature vector corresponding to each attack stage; this effectively alleviates the gradient vanishing and gradient explosion problems in the time sequence convolution network, improves training efficiency, and shortens training time.
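A minimal sketch of the residual sum with a 1×1 convolution on the skip path, as described above: the 1×1 convolution only matches the channel count of the sample alarm sequence vector to that of the branch output before the element-wise addition. NumPy and the shapes are illustrative assumptions.

```python
# Illustrative residual sum with a 1x1 convolution on the skip connection.
import numpy as np

rng = np.random.default_rng(3)
C_in, C_out, T = 8, 16, 130

x = rng.normal(size=(C_in, T))                     # sample alarm sequence vector
branch_out = rng.normal(size=(C_out, T))           # second post-discard feature vector

# A 1x1 convolution over channels is a (C_out x C_in) matrix applied at every time step.
W_1x1 = rng.normal(size=(C_out, C_in)) * 0.1
skip = W_1x1 @ x                                   # shape (C_out, T)

first_stage_features = skip + branch_out           # residual sum
print(first_stage_features.shape)                  # (16, 130)
```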
Referring to fig. 12, in one residual block, the first extended causal convolution layer, the first weight normalization layer, the first correction linear unit, and the first random discarding layer are used as a first sub-block, and the second extended causal convolution layer, the second weight normalization layer, the second correction linear unit, and the second random discarding layer are used as a second sub-block. The output of the residual block is the residual sum of the block input and the output of the second sub-block, where $o^{(k,j)} = \big(o^{(k,j)}_1, \ldots, o^{(k,j)}_h\big)$ represents the output of the j-th sub-block in the k-th residual block, $o^{(k,j)}_t$ represents the output of the j-th sub-block in the k-th residual block at time t, k is the serial number of the residual block, h is the length of the sequence, f(i) represents the element of the convolution kernel at position i, and m represents the expansion coefficient. Notably, the attention time convolution network model in fig. 12 is equivalent to the time sequence convolution network.
Referring to fig. 16, fig. 16 is another flow chart of step S240 in fig. 2. In step S240, the time series convolutional network and classifier are trained based on the sample detection attack phase, including, but not limited to, the following steps:
Step S1610, calculating a loss function based on the sample detection attack stage corresponding to the sample alarm sequence vector and the sample attack stage label;
step S1620, training the time sequence convolution network and the classifier based on the loss function.
After the sample alarm sequence vector is obtained, the sample alarm sequence vector is labeled to obtain a sample attack stage label corresponding to the sample alarm sequence vector. For example, the sample attack alarm sequence vector is [SN, S1, SN, S2, SN, S3, SN, S4, SN], which comprises 4 attack stages, namely S1, S2, S3 and S4, and the sample attack stage label corresponding to this sample attack alarm sequence vector is APT 4, where SN represents a first noise vector element or a second noise vector element. The label of the sample non-attack alarm sequence vector is NAPT. A loss function is then calculated based on the sample detection attack stage corresponding to the sample alarm sequence vector and the sample attack stage label. For n samples {(x0, y0), (x1, y1), ..., (xn, yn)}, the loss function employs a cross-entropy loss function, specifically:
$L(\theta) = -\sum_{i} \log p_{\theta}(y_i \mid x_i)$

where $\theta$ is a trainable parameter of the model; $y_i$ represents the category to which sample $x_i$ belongs; N is the total category number; and $p_{\theta}(y_i \mid x_i)$ is the Softmax probability over the N categories assigned to that category. A loss value is calculated based on the loss function, and the relevant parameters of the time sequence convolution network and the classifier are optimized according to the loss value. Steps S210 to S240 are repeatedly executed until the loss value converges or the number of repetitions reaches a preset value, so as to obtain the trained time sequence convolution network and classifier.
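A hedged sketch of computing the cross-entropy loss for a small batch; the logits, labels and class count below are random placeholders standing in for the classifier outputs and the sample attack stage labels (e.g. APT stage classes plus NAPT).

```python
# Illustrative cross-entropy loss over a batch of sample alarm sequence vectors.
import numpy as np

rng = np.random.default_rng(4)
n_samples, n_classes = 6, 5                           # placeholder batch and class count

logits = rng.normal(size=(n_samples, n_classes))      # classifier outputs (placeholders)
labels = rng.integers(0, n_classes, size=n_samples)   # sample attack stage labels

shifted = logits - logits.max(axis=1, keepdims=True)  # numerically stable Softmax
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

loss = -np.mean(np.log(probs[np.arange(n_samples), labels]))  # -log p(true class)
print(loss)
```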
Detailed description of step S250
After the training of the time sequence convolution network and the classifier is completed, the trained time sequence convolution network and classifier are obtained and can be applied in practice. The alarm texts generated by the network attack detection system (such as an IDS) in the terminal 110 within a period of time are collected and sorted according to their generation time to obtain a target attack alarm sequence; each element in the target attack alarm sequence is then converted into a vector, for example by one-hot encoding or dummy encoding, to obtain a target attack alarm sequence vector. The target attack alarm sequence vector is input into the trained time sequence convolution network and classifier to obtain an actual detection attack stage, and the network attack is detected by using the actual detection attack stage. It should be noted that the processing of the target attack alarm sequence vector in the time sequence convolution network and the classifier is the same as the processing of the sample alarm sequence vector and will not be described again here.
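A minimal sketch of this deployment step, assuming an alert-type vocabulary for one-hot encoding and a placeholder trained_model object standing in for the trained time sequence convolution network and classifier; none of these names come from the disclosure.

```python
# Illustrative sketch: collect IDS alert texts, sort by generation time,
# one-hot encode them, and feed the result to a trained model (placeholder).
import numpy as np

alerts = [
    {"time": "2024-03-01T10:00", "type": "port_scan"},
    {"time": "2024-03-01T09:30", "type": "brute_force_login"},
    {"time": "2024-03-02T14:10", "type": "privilege_escalation"},
]
vocabulary = ["port_scan", "brute_force_login", "privilege_escalation", "exfiltration"]

alerts.sort(key=lambda a: a["time"])               # order by generation time

def one_hot(alert_type):
    v = np.zeros(len(vocabulary))
    v[vocabulary.index(alert_type)] = 1.0
    return v

target_sequence_vector = np.stack([one_hot(a["type"]) for a in alerts])

# trained_model.predict(target_sequence_vector) would return the actual detection
# attack stage; 'trained_model' is assumed to exist after the training steps above.
print(target_sequence_vector.shape)                # (3, 4)
```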
Specific use procedure examples of the network attack detection method of the embodiment of the present disclosure
Referring to fig. 17 and 18, fig. 17 is a schematic diagram illustrating a specific use procedure of the network attack detection method according to an embodiment of the disclosure, and fig. 18 is a schematic diagram of the network attack detection method according to an embodiment of the disclosure. Next, a specific use procedure of the network attack detection method according to the embodiment of the present disclosure, including but not limited to the following steps 1701 to 1715, will be described in detail with reference to fig. 17. In this process, the tasks that the terminal 110 and the server 120 each take on in the network attack detection method will be described, but it should be understood by those skilled in the art that the process may be completed independently by the terminal 110 alone or by the server 120 alone. In this process, the reference attack is an APT attack.
Step 1701, obtaining a reference attack alarm sequence corresponding to a reference attack, wherein the reference attack comprises a first number of attack phases;
step 1702, converting the reference attack alarm sequence into a reference attack alarm sequence vector;
step 1703, selecting a consecutive second number of attack phases from the first number of attack phases;
Step 1704, extracting a part of vector elements from vector elements of the reference attack alarms of each attack stage in the second number of attack stages based on the first proportion, and forming a basic attack alarm sequence vector;
Step 1705, inserting a first noise vector element between vector elements of reference attack alarms of every two adjacent attack phases in the basic attack alarm sequence vector, and inserting a second noise vector element before and after the basic attack alarm sequence vector to obtain a sample attack alarm sequence vector;
step 1706, replacing vector elements of the reference attack alarm in each attack stage in the sample attack alarm sequence vector with random vector elements to obtain a sample non-attack alarm sequence vector;
Step 1707, inputting the sample alarm sequence vector into a first extended causal convolution layer to obtain a first sample extended convolution vector;
Step 1708, inputting the first sample spread convolution vector into a channel attention model to obtain a sample channel attention feature vector;
Step 1709, inputting the sample channel attention feature vector into a spatial attention model to obtain a sample spatial attention feature vector;
step 1710, multiplying the first sample spread convolution vector and the sample space attention feature vector by a multiplier to obtain a first sample multiplication vector;
Step 1711, inputting the first sample multiplication vector into a second spreading causal convolution layer to obtain a second sample spreading convolution vector;
Step 1712, performing residual error sum processing on the sample alarm sequence vector and the second sample spread convolution vector through the residual error sum layer to obtain a first stage feature vector corresponding to each attack stage;
Step 1713, calculating a loss function based on the sample detection attack stage corresponding to the sample alarm sequence vector and the sample attack stage label;
Step 1714, training a time sequence convolution network and a classifier based on the loss function;
and 1715, inputting the target attack alarm sequence vector into the trained time sequence convolution network and classifier to obtain an actual detection attack stage, and detecting the network attack by using the actual detection attack stage.
In fig. 18, the network attack detection method according to the embodiment of the present disclosure is divided into two phases, namely, an APT attack sample reconstruction phase and an APT attack multi-phase identification phase. The APT attack sample reconstruction phase corresponds to steps 1701 to 1706 in fig. 17. The APT attack multi-stage recognition phase corresponds to steps 1707 to 1715 in fig. 17.
In fig. 18, the IDS alert data set corresponds to step 1701 in fig. 17; alarm embedding corresponds to step 1702; sequence construction corresponds to steps 1702 to 1706. In sequence identification, the sample attack alarm sequence vector and the sample non-attack alarm sequence vector are labeled to obtain APT attack identification sequence samples and non-attack identification sequence samples, one part of which is used as training samples and the other part as test samples. The global attention TCN model is the time sequence convolution network and the classifier; the training samples are input into the time sequence convolution network and the classifier, corresponding to steps 1707 to 1714, and finally the test samples are input into the trained time sequence convolution network and classifier, corresponding to step 1715.
Apparatus and device descriptions of embodiments of the present disclosure
It will be appreciated that, although the steps in the various flowcharts described above are shown in succession in the order indicated by the arrows, the steps are not necessarily executed in the order indicated by the arrows. The steps are not strictly limited in order unless explicitly stated in the present embodiment, and may be performed in other orders. Moreover, at least some of the steps in the flowcharts described above may include a plurality of steps or stages that are not necessarily performed at the same time but may be performed at different times, and the order of execution of the steps or stages is not necessarily sequential, but may be performed in turn or alternately with at least a portion of the steps or stages in other steps or other steps.
Referring to fig. 19, fig. 19 is a schematic structural diagram of a network attack detection device according to an embodiment of the present disclosure. The network attack detection device 130 includes:
An obtaining module 1910, configured to obtain a reference attack alarm sequence vector corresponding to a reference attack, where each reference attack alarm associated with the reference attack alarm sequence vector indicates that an attack appearance of the reference attack is detected, and the reference attack includes a first number of attack phases;
A first vector generation module 1920, configured to generate a base attack alert sequence vector based on the reference attack alert sequence vector, where the base attack alert sequence vector includes a portion of vector elements extracted from vector elements of the reference attack alert corresponding to a second number of attack phases in the reference attack alert sequence vector, and the second number is less than the first number;
A second vector generation module 1930 for generating sample alert sequence vectors based on the base attack alert sequence vectors, the sample alert sequence vectors comprising sample attack alert sequence vectors and sample non-attack alert sequence vectors;
The training module 1940 is configured to input the sample alarm sequence vector into a sequential convolution network and a classifier that are sequentially connected to obtain a sample detection attack stage corresponding to the sample alarm sequence vector, and train the sequential convolution network and the classifier based on the sample detection attack stage;
The detection module 1950 is configured to input the target attack alarm sequence vector into the trained time sequence convolutional network and classifier, obtain an actual detection attack stage, and detect a network attack by using the actual detection attack stage.
Optionally, the acquiring module 1910 is specifically configured to:
Acquiring a reference attack alarm sequence corresponding to a reference attack, wherein each reference attack alarm in the reference attack alarm sequence indicates that an attack appearance of the reference attack is detected;
the reference attack alert sequence is converted to a reference attack alert sequence vector.
Optionally, the first vector generation module 1920 is specifically configured to:
selecting a continuous second number of attack phases from the first number of attack phases;
Based on the first proportion, extracting partial vector elements from vector elements of the reference attack alarms of each attack stage in the second number of attack stages in the reference attack alarm sequence vector to form a basic attack alarm sequence vector.
Optionally, the network attack detection device further includes a second number determining module (not shown), where the second number determining module is configured to:
acquiring an attack set to be detected, wherein the attack set to be detected comprises a plurality of attacks to be detected;
Determining the number of attack stages included in each attack to be detected;
The second number is determined based on an average of the number of attack phases.
Optionally, the network attack detection device further includes a first scale determining module (not shown), where the first scale determining module is configured to:
For each of the second number of attack phases, obtaining a third number of reference attack alarms in the attack phase;
for each of a second number of attack phases, obtaining a fourth number of reference attack alarms in the attack phases that is different from the previous reference attack alarm;
the first ratio is determined based on the third number and the fourth number.
Optionally, the second vector generation module 1930 is specifically configured to:
Inserting a first noise vector element between vector elements of reference attack alarms of every two adjacent attack stages in the basic attack alarm sequence vector, and inserting a second noise vector element before and after the basic attack alarm sequence vector to obtain a sample attack alarm sequence vector;
And replacing vector elements of the reference attack alarm in each attack stage in the sample attack alarm sequence vector with random vector elements to obtain a sample non-attack alarm sequence vector.
Optionally, the training module 1940 is specifically configured to:
inputting the sample alarm sequence vector into a time sequence convolution network to obtain a first stage characteristic vector corresponding to each attack stage;
and inputting the obtained first-stage feature vector corresponding to each attack stage into a classifier to obtain a sample detection attack stage corresponding to the sample alarm sequence vector.
Optionally, the time series convolution network comprises a first extended causal convolution layer, a channel attention model, a spatial attention model, a multiplier, a second extended causal convolution layer, and a residual sum layer; the training module includes a first stage feature vector generation sub-module (not shown) for:
inputting the sample alarm sequence vector into a first extended causal convolution layer to obtain a first sample extended convolution vector;
Inputting the first sample expansion convolution vector into a channel attention model to obtain a sample channel attention feature vector;
inputting the sample channel attention feature vector into a space attention model to obtain a sample space attention feature vector;
Multiplying the first sample spread convolution vector and the sample space attention feature vector by a multiplier to obtain a first sample multiplication vector;
Inputting the first sample multiplication vector into a second extended causal convolution layer to obtain a second sample extended convolution vector;
and carrying out residual error sum processing on the sample alarm sequence vector and the second sample expansion convolution vector through the residual error sum layer to obtain a first-stage characteristic vector corresponding to each attack stage.
Optionally, the time sequence convolution network further comprises a first weight normalization layer, a first correction linear unit and a first random discarding layer; the first stage feature vector generation submodule includes a channel attention feature vector generation unit (not shown) for:
performing weight normalization operation on the first sample spread convolution vector through a first weight normalization layer to obtain a first weight normalization feature vector;
Nonlinear activation is carried out on the first weight normalized feature vector through a first correction linear unit, so that a first activated feature vector is obtained;
randomly discarding vector elements in the first activated feature vector through a first random discarding layer to obtain a first discarded feature vector;
and inputting the first discarded feature vector into a channel attention model to obtain a sample channel attention feature vector.
Optionally, the time sequence convolution network further comprises a second weight normalization layer, a second correction linear unit and a second random discarding layer; the first-stage feature vector generation sub-module further includes a first-stage feature vector generation unit (not shown) for:
Carrying out renormalization operation on the second sample expansion convolution vector through a second weight normalization layer to obtain a second weight normalization feature vector;
Nonlinear activation is carried out on the second weight normalized feature vector through a second correction linear unit, and a second activated feature vector is obtained;
Randomly discarding vector elements in the second activated feature vector through a second random discarding layer to obtain a second discarded feature vector;
And obtaining a first stage characteristic vector corresponding to each attack stage by residual errors and layers on the sample alarm sequence vector and the second discarded characteristic vector.
Optionally, the first stage feature vector generation sub-module further comprises a channel attention model processing unit (not shown) and a spatial attention model processing unit (not shown), the channel attention model processing unit being configured to:
determining a channel attention coefficient based on the first sample spread convolution vector;
determining a sample channel attention feature vector based on the first sample spread convolution vector and the channel attention coefficient;
The spatial attention model processing unit is used for:
determining a spatial attention coefficient based on the sample channel attention feature vector;
a sample spatial attention feature vector is determined based on the sample channel attention feature vector and the spatial attention coefficients.
Optionally, the training module further comprises a loss function sub-module for:
calculating a loss function based on a sample detection attack stage corresponding to the sample alarm sequence vector and a sample attack stage label;
based on the loss function, a time series convolution network and classifier are trained.
Referring to fig. 20, fig. 20 is a partial block diagram of a terminal implementing a network attack detection method according to an embodiment of the present disclosure. The terminal 110 includes: radio frequency (RF) circuitry 2010, memory 2015, input unit 2030, display unit 2040, sensor 2050, audio circuitry 2060, wireless fidelity (Wireless Fidelity, WiFi) module 2070, processor 2080, and power supply 2090. It will be appreciated by those skilled in the art that the configuration of terminal 110 shown in fig. 20 is not limiting of a cell phone or computer and may include more or fewer components than shown, or may combine certain components, or use a different arrangement of components.
The RF circuit 2010 may be used for receiving and transmitting signals during the process of receiving and transmitting information or communication, in particular, after receiving downlink information of the base station, the downlink information is processed by the processor 2080; in addition, the data of the design uplink is sent to the base station.
The memory 2015 may be used to store software programs and modules, and the processor 2080 executes various functional applications and data processing of the terminal 110 by executing the software programs and modules stored in the memory 2015.
The input unit 2030 may be used for receiving input numerical or character information and generating key signal inputs related to setting and function control of the terminal 110. Specifically, the input unit 2030 may include a touch panel 2031 and other input devices 2032.
The display unit 2040 may be used to display input information or provided information and various menus of the terminal 110. The display unit 2040 may include a display panel 2041.
Audio circuitry 2060, speaker 2061, microphone 2062 may provide an audio interface.
In this embodiment, the processor 2080 included in the terminal 110 may perform the network attack detection method of the previous embodiments.
The terminal 110 of the embodiments of the present disclosure includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, an intelligent home appliance, a vehicle-mounted terminal, an aircraft, etc. The embodiment of the invention can be applied to various scenes, including but not limited to cloud technology, artificial intelligence, intelligent transportation, auxiliary driving and the like.
Referring to fig. 21, fig. 21 is a partial block diagram of a server performing a network attack detection method according to an embodiment of the present disclosure. The server 120 may vary considerably in configuration or performance and may include one or more central processing units (Central Processing Units, CPU) 2122 (e.g., one or more processors), memory 2132, and one or more storage media 2130 (e.g., one or more mass storage devices) that store applications 2142 or data 2144. The memory 2132 and the storage medium 2130 may be transient storage or persistent storage. The program stored in the storage medium 2130 may include one or more modules (not shown), each of which may include a series of instruction operations on the server 120. Still further, the central processing unit 2122 may be configured to communicate with the storage medium 2130 and execute, on the server 120, the series of instruction operations in the storage medium 2130.
The server 120 can also include one or more power supplies 2126, one or more wired or wireless network interfaces 2150, one or more input/output interfaces 2158, and/or one or more operating systems 2141, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, and the like.
A processor in server 120 may be used to perform the network attack detection method of the embodiments of the present disclosure.
The embodiment of the disclosure also provides an electronic device, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the network attack detection method when executing the computer program.
The embodiments of the present disclosure also provide a computer readable storage medium storing a program code for executing the network attack detection method of the foregoing embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program. The processor of the computer device reads the computer program and executes it, so that the computer device executes the network attack detection method described above.
The terms "first," "second," "third," "fourth," and the like in the description of the present disclosure and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein, for example. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this disclosure, "at least one" means one or more, and "a plurality" means two or more. "and/or" for describing the association relationship of the association object, the representation may have three relationships, for example, "a and/or B" may represent: only a, only B and both a and B are present, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
It should be understood that in the description of the embodiments of the present disclosure, the meaning of a plurality (or multiple) is two or more, and that greater than, less than, exceeding, etc. is understood to not include the present number, and that greater than, less than, within, etc. is understood to include the present number.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory RAM), a magnetic disk, or an optical disk, etc., which can store program codes.
It should also be appreciated that the various implementations provided by the embodiments of the present disclosure may be arbitrarily combined to achieve different technical effects.
The above is a specific description of the embodiments of the present disclosure, but the present disclosure is not limited to the above embodiments; various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present disclosure, and such modifications and substitutions are included in the scope of the present disclosure as defined in the claims.
The embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the present disclosure is not limited to the above embodiments, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present disclosure. Furthermore, embodiments of the present disclosure and features in the embodiments may be combined with each other without conflict.

Claims (10)

1. A network attack detection method, comprising:
Acquiring a reference attack alarm sequence vector corresponding to a reference attack, wherein each reference attack alarm associated with the reference attack alarm sequence vector indicates that one occurrence of the reference attack has been detected, and the reference attack comprises a first number of attack stages;
generating a basic attack alarm sequence vector based on the reference attack alarm sequence vector, wherein the basic attack alarm sequence vector comprises partial vector elements extracted from vector elements of reference attack alarms corresponding to a second number of attack stages in the reference attack alarm sequence vector, and the second number is smaller than the first number;
Generating a sample alarm sequence vector based on the basic attack alarm sequence vector, the sample alarm sequence vector comprising a sample attack alarm sequence vector and a sample non-attack alarm sequence vector;
Inputting the sample alarm sequence vector into a time-series convolutional network and a classifier that are connected in sequence to obtain a sample detection attack stage corresponding to the sample alarm sequence vector, and training the time-series convolutional network and the classifier based on the sample detection attack stage;
Inputting a target attack alarm sequence vector into the trained time-series convolutional network and the trained classifier to obtain an actual detection attack stage, and detecting a network attack by using the actual detection attack stage.
2. The network attack detection method according to claim 1, wherein the generating a basic attack alarm sequence vector based on the reference attack alarm sequence vector comprises:
selecting the second number of consecutive attack stages from the first number of attack stages;
And extracting, based on a first proportion, the partial vector elements from the vector elements of the reference attack alarms of each of the second number of attack stages in the reference attack alarm sequence vector, to form the basic attack alarm sequence vector.
3. The network attack detection method according to claim 1, wherein the generating a sample alarm sequence vector based on the basic attack alarm sequence vector comprises:
Inserting a first noise vector element between the vector elements of the reference attack alarms of every two adjacent attack stages in the basic attack alarm sequence vector, and inserting a second noise vector element before and after the basic attack alarm sequence vector, to obtain the sample attack alarm sequence vector;
And replacing the vector elements of the reference attack alarm in each attack stage in the sample attack alarm sequence vector with random vector elements to obtain the sample non-attack alarm sequence vector.
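
Claims 2 and 3 together describe how training samples are synthesized from a single reference attack: a contiguous run of attack stages is selected, a proportion of each selected stage's alarm vectors is kept, noise vectors are inserted between and around the stages to form an attack sample, and the alarm vectors are replaced with random vectors to form a non-attack sample. The NumPy sketch below is one possible reading of these steps; the array layout (one row per alarm vector, grouped by stage), the Gaussian noise, the default proportion, and the helper names base_sequence and attack_and_non_attack_samples are illustrative assumptions rather than anything fixed by the claims.

    # Sketch of the sample-generation steps of claims 2-3 (layout and noise model assumed).
    import numpy as np

    rng = np.random.default_rng(0)

    def base_sequence(stage_alarms, second_number, first_proportion=0.5):
        """Claim 2: keep a `first_proportion` fraction of the alarm vectors of a
        contiguous run of `second_number` attack stages."""
        first_number = len(stage_alarms)                  # stage_alarms[i]: (n_i, d) array for stage i
        start = int(rng.integers(0, first_number - second_number + 1))
        picked = []
        for stage in range(start, start + second_number):
            alarms = stage_alarms[stage]
            k = max(1, int(len(alarms) * first_proportion))
            idx = np.sort(rng.choice(len(alarms), size=k, replace=False))
            picked.append(alarms[idx])
        return picked, list(range(start, start + second_number))

    def attack_and_non_attack_samples(base, noise_scale=0.1):
        """Claim 3: insert noise vectors between adjacent stages and at both ends,
        then randomize the alarm vectors to obtain the non-attack counterpart."""
        d = base[0].shape[1]
        noise = lambda: rng.normal(scale=noise_scale, size=(1, d))
        parts, alarm_rows, row = [noise()], [], 1         # leading noise element
        for stage_vecs in base:
            parts.append(stage_vecs)
            alarm_rows.extend(range(row, row + len(stage_vecs)))
            row += len(stage_vecs)
            parts.append(noise())                         # noise between stages and after the last one
            row += 1
        attack_sample = np.vstack(parts)
        non_attack = attack_sample.copy()
        non_attack[alarm_rows] = rng.normal(scale=noise_scale, size=(len(alarm_rows), d))
        return attack_sample, non_attack
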
4. The network attack detection method according to claim 1, wherein the inputting the sample alarm sequence vector into the time-series convolutional network and the classifier that are connected in sequence to obtain the sample detection attack stage corresponding to the sample alarm sequence vector comprises:
Inputting the sample alarm sequence vector into the time-series convolutional network to obtain a first-stage feature vector corresponding to each attack stage;
And inputting the obtained first-stage feature vector corresponding to each attack stage into the classifier to obtain the sample detection attack stage corresponding to the sample alarm sequence vector.
5. The network attack detection method according to claim 4, wherein the time-series convolutional network comprises a first dilated causal convolution layer, a channel attention model, a spatial attention model, a multiplier, a second dilated causal convolution layer, and a residual sum layer;
wherein the inputting the sample alarm sequence vector into the time-series convolutional network to obtain the first-stage feature vector corresponding to each attack stage comprises:
inputting the sample alarm sequence vector into the first dilated causal convolution layer to obtain a first sample dilated convolution vector;
inputting the first sample dilated convolution vector into the channel attention model to obtain a sample channel attention feature vector;
inputting the sample channel attention feature vector into the spatial attention model to obtain a sample spatial attention feature vector;
Multiplying the first sample dilated convolution vector and the sample spatial attention feature vector by the multiplier to obtain a first sample product vector;
inputting the first sample product vector into the second dilated causal convolution layer to obtain a second sample dilated convolution vector;
And performing residual sum processing on the sample alarm sequence vector and the second sample dilated convolution vector through the residual sum layer to obtain the first-stage feature vector corresponding to each attack stage.
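
Claim 5 routes the output of the first dilated causal convolution through a channel attention model and then a spatial attention model, and the multiplier applies the resulting attention to that convolution output. The claim does not define the internals of the two attention models; the CBAM-style PyTorch modules below are one plausible instantiation over one-dimensional alarm sequences laid out as (batch, channels, time), with the reduction ratio and kernel size chosen purely for illustration.

    # One assumed realization of the channel and spatial attention models of claim 5.
    import torch
    import torch.nn as nn

    class ChannelAttention1d(nn.Module):
        """Returns channel-reweighted features (a reading of the 'channel attention feature vector')."""
        def __init__(self, channels, reduction=4):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(channels, max(1, channels // reduction)),
                nn.ReLU(),
                nn.Linear(max(1, channels // reduction), channels),
            )
        def forward(self, x):                                   # x: (N, C, T)
            gate = torch.sigmoid(self.mlp(x.mean(dim=2)) + self.mlp(x.amax(dim=2)))
            return x * gate.unsqueeze(-1)                       # broadcast (N, C, 1) gate over time

    class SpatialAttention1d(nn.Module):
        """Returns a per-time-step attention map (a reading of the 'spatial attention feature vector')."""
        def __init__(self, kernel_size=7):
            super().__init__()
            self.conv = nn.Conv1d(2, 1, kernel_size, padding=kernel_size // 2)
        def forward(self, x):                                   # x: (N, C, T)
            pooled = torch.cat([x.mean(dim=1, keepdim=True),
                                x.amax(dim=1, keepdim=True)], dim=1)
            return torch.sigmoid(self.conv(pooled))             # (N, 1, T) map with values in (0, 1)

Under this reading, the map produced by SpatialAttention1d is what the multiplier of claim 5 applies, element-wise, to the first dilated convolution output.
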
6. The network attack detection method according to claim 5, wherein the time-series convolutional network further comprises a first weight normalization layer, a first rectified linear unit, and a first random dropout layer;
wherein the inputting the first sample dilated convolution vector into the channel attention model to obtain a sample channel attention feature vector comprises:
performing a weight normalization operation on the first sample dilated convolution vector through the first weight normalization layer to obtain a first weight-normalized feature vector;
performing nonlinear activation on the first weight-normalized feature vector through the first rectified linear unit to obtain a first activated feature vector;
randomly discarding vector elements in the first activated feature vector through the first random dropout layer to obtain a first discarded feature vector;
and inputting the first discarded feature vector into the channel attention model to obtain the sample channel attention feature vector.
7. The network attack detection method according to claim 5, wherein the time-series convolutional network further comprises a second weight normalization layer, a second rectified linear unit, and a second random dropout layer;
wherein the performing residual sum processing on the sample alarm sequence vector and the second sample dilated convolution vector through the residual sum layer to obtain the first-stage feature vector corresponding to each attack stage comprises:
performing a weight normalization operation on the second sample dilated convolution vector through the second weight normalization layer to obtain a second weight-normalized feature vector;
performing nonlinear activation on the second weight-normalized feature vector through the second rectified linear unit to obtain a second activated feature vector;
randomly discarding vector elements in the second activated feature vector through the second random dropout layer to obtain a second discarded feature vector;
And performing residual sum processing on the sample alarm sequence vector and the second discarded feature vector through the residual sum layer to obtain the first-stage feature vector corresponding to each attack stage.
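
Claims 5 to 7 together describe one residual block of the time-series convolutional network: a dilated causal convolution wrapped in weight normalization, a rectified linear unit and random dropout, the channel-then-spatial attention stage with its multiplier, a second wrapped dilated causal convolution, and a residual sum with the block input. The sketch below wires these pieces together, reusing ChannelAttention1d and SpatialAttention1d from the previous sketch; the kernel size, dilation, channel widths, dropout rate, and the 1x1 convolution used to match channels for the residual sum are assumptions the claims leave open.

    # Sketch of one residual block of the claims 5-7 network (all sizes assumed).
    import torch
    import torch.nn as nn
    from torch.nn.utils import weight_norm

    class CausalConv1d(nn.Module):
        """Weight-normalized dilated causal convolution: left-pad so outputs never see future alarms."""
        def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1):
            super().__init__()
            self.pad = (kernel_size - 1) * dilation
            self.conv = weight_norm(nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation))
        def forward(self, x):                                    # x: (N, C, T)
            return self.conv(nn.functional.pad(x, (self.pad, 0)))

    class AttentionTCNBlock(nn.Module):
        def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1, dropout=0.2):
            super().__init__()
            self.conv1 = CausalConv1d(in_ch, out_ch, kernel_size, dilation)   # claims 5 and 6
            self.relu1, self.drop1 = nn.ReLU(), nn.Dropout(dropout)           # claim 6
            self.channel_att = ChannelAttention1d(out_ch)                     # claim 5
            self.spatial_att = SpatialAttention1d()                           # claim 5
            self.conv2 = CausalConv1d(out_ch, out_ch, kernel_size, dilation)  # claims 5 and 7
            self.relu2, self.drop2 = nn.ReLU(), nn.Dropout(dropout)           # claim 7
            self.match = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
        def forward(self, x):                                    # x: (N, in_ch, T)
            y = self.drop1(self.relu1(self.conv1(x)))            # weight norm -> ReLU -> dropout
            att = self.spatial_att(self.channel_att(y))          # (N, 1, T) attention map
            y = y * att                                          # the multiplier of claim 5
            y = self.drop2(self.relu2(self.conv2(y)))            # second dilated causal conv stack
            return y + self.match(x)                             # residual sum with the block input
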
8. A network attack detection device, comprising:
an acquisition module, used for acquiring a reference attack alarm sequence vector corresponding to a reference attack, wherein each reference attack alarm associated with the reference attack alarm sequence vector indicates that one occurrence of the reference attack has been detected, and the reference attack comprises a first number of attack stages;
a first vector generation module, used for generating a basic attack alarm sequence vector based on the reference attack alarm sequence vector, wherein the basic attack alarm sequence vector comprises partial vector elements extracted from vector elements of reference attack alarms corresponding to a second number of attack stages in the reference attack alarm sequence vector, and the second number is smaller than the first number;
a second vector generation module, used for generating a sample alarm sequence vector based on the basic attack alarm sequence vector, the sample alarm sequence vector comprising a sample attack alarm sequence vector and a sample non-attack alarm sequence vector;
a training module, used for inputting the sample alarm sequence vector into a time-series convolutional network and a classifier that are connected in sequence to obtain a sample detection attack stage corresponding to the sample alarm sequence vector, and training the time-series convolutional network and the classifier based on the sample detection attack stage;
and a detection module, used for inputting a target attack alarm sequence vector into the trained time-series convolutional network and the trained classifier to obtain an actual detection attack stage, and detecting a network attack by using the actual detection attack stage.
9. An electronic device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to implement the network attack detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the network attack detection method according to any one of claims 1 to 7.
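
Read together, claims 1 to 8 amount to a small training-and-detection loop: synthesize labelled alarm sequences from a reference attack, train the attention-augmented time-series convolutional network together with a classifier, then feed a target alarm sequence through the trained pair to obtain the detected attack stage. The sketch below strings the earlier fragments (base_sequence, attack_and_non_attack_samples, AttentionTCNBlock) together; the linear classifier head, the labelling scheme (last selected stage for attack samples, an extra class for non-attack samples), the temporal mean-pooling, the optimizer, and the epoch count are all illustrative assumptions, and a fuller implementation would stack several AttentionTCNBlock instances with growing dilation rather than the single block used here.

    # Illustrative end-to-end use of the sketches above (claims 1-8); hyperparameters assumed.
    import torch
    import torch.nn as nn

    def train_and_detect(stage_alarms, target_seq, second_number=2, epochs=20):
        feat_dim = stage_alarms[0].shape[1]
        num_stages = len(stage_alarms)
        tcn = AttentionTCNBlock(in_ch=feat_dim, out_ch=32)      # time-series convolutional network
        clf = nn.Linear(32, num_stages + 1)                     # classifier; class num_stages = "no attack"
        opt = torch.optim.Adam(list(tcn.parameters()) + list(clf.parameters()), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            base, stages = base_sequence(stage_alarms, second_number)         # claim 2
            atk, non_atk = attack_and_non_attack_samples(base)                # claim 3
            for seq, label in ((atk, stages[-1]), (non_atk, num_stages)):
                x = torch.as_tensor(seq, dtype=torch.float32).T.unsqueeze(0)  # (1, feat_dim, T)
                feat = tcn(x).mean(dim=2)    # pooled here for brevity; the claims keep per-stage features
                loss = loss_fn(clf(feat), torch.tensor([label]))
                opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():                                   # detection step of claim 1
            x = torch.as_tensor(target_seq, dtype=torch.float32).T.unsqueeze(0)
            return int(clf(tcn(x).mean(dim=2)).argmax(dim=1))   # actual detection attack stage
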
CN202410472119.2A 2024-04-19 2024-04-19 Network attack detection method and device, electronic equipment and storage medium Active CN118075030B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410472119.2A CN118075030B (en) 2024-04-19 2024-04-19 Network attack detection method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN118075030A true CN118075030A (en) 2024-05-24
CN118075030B CN118075030B (en) 2024-07-02

Family

ID=91106021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410472119.2A Active CN118075030B (en) 2024-04-19 2024-04-19 Network attack detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN118075030B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210034753A1 (en) * 2019-07-29 2021-02-04 Ventech Solutions, Inc. Method and system for neural network based data analytics in software security vulnerability testing
CN112019497A (en) * 2020-07-10 2020-12-01 上海大学 Word embedding-based multi-stage network attack detection method
CN117411669A (en) * 2023-09-14 2024-01-16 广州大学 APT attack stage detection method, system, medium and device based on time convolution network
CN117811801A (en) * 2023-12-28 2024-04-02 天翼安全科技有限公司 Model training method, device, equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI Haotian (李昊天); SHENG Yiqiang (盛益强): "单时序特征图卷积网络融合预测方法" [fusion prediction method based on a single time-series feature graph convolutional network], Computer and Modernization (计算机与现代化), no. 09, 15 September 2020 (2020-09-15), pages 36-40 *

Also Published As

Publication number Publication date
CN118075030B (en) 2024-07-02

Similar Documents

Publication Publication Date Title
Muna et al. Identification of malicious activities in industrial internet of things based on deep learning models
WO2019175880A1 (en) Method and system for classifying data objects based on their network footprint
CN115134160B (en) Attack detection method and system based on attack migration
Wang et al. A lightweight intrusion detection method for IoT based on deep learning and dynamic quantization
Onik et al. An analytical comparison on filter feature extraction method in data mining using J48 classifier
Yin et al. Neural network fragile watermarking with no model performance degradation
CN117082118B (en) Network connection method based on data derivation and port prediction
CN114726823A (en) Domain name generation method, device and equipment based on generation countermeasure network
Fayyad et al. Attack scenario prediction methodology
CN113783876B (en) Network security situation awareness method based on graph neural network and related equipment
CN113742718B (en) Industrial Internet equipment attack path restoration method, related equipment and system
CN118075030B (en) Network attack detection method and device, electronic equipment and storage medium
Doghramachi et al. Internet of Things (IoT) Security Enhancement Using XGboost Machine Learning Techniques.
Shi et al. A lightweight image splicing tampering localization method based on MobileNetV2 and SRM
CN115567305B (en) Sequential network attack prediction analysis method based on deep learning
CN111797997A (en) Network intrusion detection method, model construction method, device and electronic equipment
Ding et al. Network intrusion detection based on BiSRU and CNN
EP4020886B1 (en) System and method for detecting suspicious websites in proxy's data streams
Niu et al. Application of a new feature generation algorithm in intrusion detection system
CN117171738A (en) Malicious software analysis method, device, storage medium and equipment
Ale et al. Few-shot learning to classify android malwares
CN115906064A (en) Detection method, detection device, electronic equipment and computer readable medium
CN116886379B (en) Network attack reconstruction method, model training method and related devices
Dharaneish et al. Enhanced Botnet Attack Detection using Machine Learning and Neural Networks
Ahmed Anomaly-based Network Intrusion Detection System for IoT using Deep Learning Model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant