CN113190844B

CN113190844B - Detection method, correlation method and correlation device

Info

Publication number: CN113190844B
Application number: CN202110552183.8A
Authority: CN
Inventors: 顾立明
Original assignee: Sangfor Technologies Co Ltd
Current assignee: Sangfor Technologies Co Ltd
Priority date: 2021-05-20
Filing date: 2021-05-20
Publication date: 2024-05-28
Anticipated expiration: 2041-05-20
Also published as: CN113190844A

Abstract

The application discloses a detection method comprising: acquiring a terminal behavior log; performing event association feature extraction on the terminal behavior log according to the causal relationships between the entities in the terminal behavior log to obtain a feature to be detected; and detecting the feature to be detected with a pre-trained behavior detection model to obtain a detection result. Because the event association features are extracted according to the causal relationships between the entities in the terminal behavior log, discrete events are linked by associations that do not depend on time order, and terminal behavior is detected with a behavior detection model rather than by matching single discrete behavior features against manually written rules, which improves the effect and accuracy of target entity detection. The application also discloses a behavior detection model generation method, a detection device, a behavior detection model generation device, a server and a computer readable storage medium, which have the same beneficial effects.

Description

Detection method, correlation method and correlation device

Technical Field

The present application relates to the field of computer technology, and in particular, to a detection method, a behavior detection model generation method, a detection apparatus, a behavior detection model generation apparatus, a server, and a computer readable storage medium.

Background

In the field of terminal security, each behavior in the behavior log generated by a terminal needs to be examined in order to identify a target behavior and the target entity that performs it. For example, the goal may be to detect suspicious or malicious behavior and the malicious entity that performs it.

In general, a target entity in a terminal may be detected by feature matching. However, feature matching is easily defeated when the target entity disguises itself, which allows it to evade detection. Therefore, to improve the detection accuracy of the target entity, behavior detection schemes based on the terminal behavior log are now commonly used.

In the related art, the recorded terminal behavior log is matched against rules written by technicians, thereby analyzing and detecting the target entity. However, the quality and effect of the rules depend heavily on the technicians' experience, so the detection coverage is insufficient, the target entity cannot be detected comprehensively, and the effect and accuracy of target entity detection are reduced.

Therefore, how to improve the effect of target entity detection is a major concern for those skilled in the art.

Disclosure of Invention

The application aims to provide a detection method, a behavior detection model generation method, a detection device, a behavior detection model generation device, a server and a computer readable storage medium, which solve the problem of low accuracy of the existing detection means.

In order to solve the above technical problems, the present application provides a detection method, including:

acquiring a terminal behavior log;

carrying out event association feature extraction processing on the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain features to be detected;

and detecting the feature to be detected by adopting a pre-trained behavior detection model to obtain a detection result.

Optionally, the behavior detection model includes a graph neural network model.

Optionally, the step of extracting the event-related feature from the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain the feature to be detected includes:

carrying out format consistency processing on the terminal behavior log according to a preset format to obtain a preprocessing log;

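and carrying out event association feature extraction processing on the preprocessing log according to the causal relationship among the entities in the preprocessing log to obtain the feature to be detected.

Optionally, the step of carrying out event association feature extraction processing on the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain the feature to be detected includes:

performing triple analysis on the terminal behavior log to obtain the entity of each event in the terminal behavior log and the operation corresponding to the causal relationship between the entities;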

and constructing the feature to be detected according to the entity, the operation and the causal relationship among the entities.

Optionally, the step of constructing the feature to be detected according to the entity, the operation, and the causal relationship among the entities includes:

Integrating the entities corresponding to all the events in the terminal behavior log to obtain a plurality of entities to be associated corresponding to all the events;

and carrying out network structure association on the operation and the entities to be associated according to the causal relationship among the entities to be associated to obtain the characteristics to be detected.

Optionally, the step of constructing the feature to be detected according to the entity, the operation, and the causal relationship among the entities includes:

for the events in each terminal behavior log, constructing a causal relationship edge based on the entity, the operation and the causal relationship among the entities; wherein the endpoints of the causal relationship edge characterize the entities, and the directed edge of the causal relationship edge characterizes the operation between the entities;

and carrying out directed graph aggregation on all the causal relationship edges to obtain the feature to be detected.

Optionally, the step of performing triple analysis on the terminal behavior log to obtain an entity of each event in the terminal behavior log and an operation corresponding to a causal relationship between the entities includes:

Carrying out structural analysis on the terminal behavior log according to the triplet structure to obtain a triplet structure of each event in the terminal behavior log;

Carrying out attribute analysis on the terminal behavior log according to an attribute information format to obtain attribute information of each event in the terminal behavior log;

And adding the attribute information of each event to the corresponding position of the corresponding triplet structure to obtain the entity of each event in the terminal behavior log and the operation corresponding to the causal relationship between the entities.

The application also provides a behavior detection model generation method, which comprises the following steps:

Acquiring a terminal behavior log training set;

Performing event association feature extraction processing on the terminal behavior log training set according to causal relation among entities in the terminal behavior log training set to obtain feature data to be trained;

and training a detection model according to the feature data to be trained to obtain a behavior detection model.

Optionally, the behavior detection model includes a graph neural network model.

Optionally, the step of performing event association feature extraction processing on the terminal behavior log training set according to the causal relationship between the entities in the terminal behavior log training set to obtain feature data to be trained includes:

performing format consistency processing on the terminal behavior log training set according to a preset format to obtain a preprocessing log;

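and carrying out event association feature extraction processing on the preprocessing log according to the causal relationship among the entities in the preprocessing log to obtain the feature data to be trained.

Optionally, the step of performing event association feature extraction processing on the terminal behavior log training set according to the causal relationship between the entities in the terminal behavior log training set to obtain feature data to be trained includes:

performing triple analysis on the terminal behavior log training set to obtain the entity of each event in the terminal behavior log training set and the operation corresponding to the causal relationship among the entities;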

and constructing the feature data to be trained according to the entity, the operation and the causal relationship among the entities.

Optionally, the step of constructing the feature data to be trained according to the entity, the operation, and the causal relationship among the entities includes:

integrating the entities corresponding to all the events in the terminal behavior log training set to obtain a plurality of entities to be associated corresponding to all the events;

and carrying out network structure association on the operation and the entities to be associated according to the causal relationship among the entities to be associated to obtain the feature data to be trained.

Optionally, the step of constructing the feature data to be trained according to the entity, the operation, and the causal relationship among the entities includes:

for the events in each terminal behavior log training set, constructing a causal relationship edge based on the entity, the operation and the causal relationship among the entities; wherein the endpoints of the causal relationship edge characterize the entities, and the directed edge of the causal relationship edge characterizes the operation between the entities;

and carrying out directed graph aggregation on all the causal relationship edges to obtain the feature data to be trained.

Optionally, the step of performing triple analysis on the terminal behavior log training set to obtain an entity of each event in the terminal behavior log training set and an operation corresponding to a causal relationship between the entities includes:

The application also provides a detection device, comprising:

the behavior log acquisition module is used for acquiring a terminal behavior log;

the event correlation feature extraction module is used for carrying out event correlation feature extraction processing on the terminal behavior log according to the causal relationship among the entities in the terminal behavior log to obtain features to be detected;

And the feature detection module is used for detecting the feature to be detected by adopting a pre-trained behavior detection model to obtain a detection result.

The application also provides a behavior detection model generating device, comprising:

The training set acquisition module is used for acquiring a terminal behavior log training set;

The diagram feature extraction module is used for carrying out event association feature extraction processing on the terminal behavior log training set according to the causal relationship among the entities in the terminal behavior log training set to obtain feature data to be trained;

And the model training module is used for carrying out detection model training according to the feature data to be trained to obtain a behavior detection model.

The application also provides a server, comprising:

a memory for storing a computer program;

a processor for implementing the steps of the detection method as described above and/or the steps of the behavior detection model generation method as described above when executing the computer program.

The present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the detection method as described above and/or the steps of the behavior detection model generation method as described above.

The detection method provided by the application comprises the following steps: acquiring a terminal behavior log; carrying out event association feature extraction processing on the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain features to be detected; and detecting the feature to be detected by adopting a pre-trained behavior detection model to obtain a detection result.

Because the event association features are extracted from the terminal behavior log according to the causal relationships between the entities in the log, discrete events are linked by associations that do not depend on time order. Terminal behavior is then detected with a behavior detection model rather than by matching single discrete behavior features against manually written rules, which improves the effect and accuracy of target entity detection.

The application also provides a behavior detection model generation method, a detection device, a behavior detection model generation device, a server and a computer readable storage medium, which have the same beneficial effects; these are not repeated here.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a detection method according to an embodiment of the present application;

FIG. 2 is a schematic diagram of a first feature of the detection method according to the embodiment of the present application;

FIG. 3 is a schematic diagram of a second feature of the detection method according to the embodiment of the present application;

FIG. 4 is a schematic diagram of a third feature of the detection method according to the embodiment of the present application;

FIG. 5 is a schematic diagram of a fourth feature of the detection method according to the embodiment of the present application;

FIG. 6 is a flowchart of a behavior detection model generation method according to an embodiment of the present application;

FIG. 7 is a flowchart of another detection method according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a detection device according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of a behavior detection model generating device according to an embodiment of the present application.

Detailed Description

The core of the application is to provide a detection method, a behavior detection model generation method, a detection device, a behavior detection model generation device, a server and a computer readable storage medium, which solve the problem of low accuracy of the existing detection means.

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

In the related art, the recorded terminal behavior log is matched against rules written by technicians, thereby analyzing and detecting the target entity. However, the quality and effect of the rules depend heavily on the technicians' experience, so the detection coverage is insufficient and the target entity cannot be detected comprehensively.

In another related art, a model is trained on training data by machine learning and then used for prediction. Such schemes typically organize behaviors into a time-ordered sequence and then apply algorithms such as N-grams to extract features, on which model training and prediction are based. The main disadvantage of such schemes is that, because the behaviors are simply organized into a time sequence, the complete information of the attack scenario cannot be described. In particular, when several similar entities are involved (for example, several processes running simultaneously), the desired detection effect cannot be achieved, and the effect and accuracy of target entity detection are reduced.

Therefore, the application provides a detection method that extracts event association features from the terminal behavior log according to the causal relationships between the entities in the log to obtain the feature to be detected, so that discrete events are linked by associations that do not depend on time order. Terminal behavior is then detected with a behavior detection model rather than by matching single discrete behavior features against manually written rules, which improves the effect and accuracy of target entity detection.

In order to improve the efficiency and accuracy of detecting the terminal behavior log, discrete log content is examined from the perspective of causal relationships rather than time relationships, which avoids the information loss that a purely time-based view incurs. The detection method provided by the application is described below by way of an embodiment.

Referring to FIG. 1, FIG. 1 is a flowchart of a detection method according to an embodiment of the application.

In this embodiment, the method may include:

S101, acquiring a terminal behavior log;

This step acquires the terminal behavior log, that is, the log data on which detection is to be performed. The terminal behavior log may be acquired from the terminal, from a database, or from historical data, which is not limited herein.

Further, the terminal behavior log may also be obtained from different data sources. The data sources may be different terminals, pre-stored log data in a database, or a combination of real-time data from a terminal and pre-stored log data from a database. The manner of obtaining the terminal behavior log in this embodiment is therefore not unique and is not specifically limited herein; a suitable acquisition mode can be selected according to the actual application.

The terminal refers to a terminal device, generally a general-purpose computer input/output device with communication, processing and control functions. Common terminal forms include a PC (Personal Computer), a server, a mobile terminal device, and the like. Various types of entities exist on the terminal, such as users, processes, files and networks. The activity of these entities on the terminal manifests itself as various behaviors, namely terminal behaviors, for example: a user runs a process, a process creates a file, or a process accesses a network resource.

The terminal behavior log is log data obtained by recording terminal behaviors in the current terminal monitoring product, the current terminal security product or the current terminal auditing product. That is, the terminal behavior is recorded in a terminal behavior log manner, and some of the terminal behavior logs may be logs of the terminal operating system, and some of the terminal behavior logs may be monitored and recorded by a third party product.

S102, carrying out event association feature extraction processing on the terminal behavior log according to the causal relationship among the entities in the terminal behavior log to obtain features to be detected;

Building on S101, this step extracts the feature to be detected from the terminal behavior log. Specifically, event association feature extraction is performed on the terminal behavior log according to the causal relationships between the entities in the log, yielding the feature to be detected. One option is to analyze the causal relationship in each event of the terminal behavior log and aggregate all the causal relationships into a directed graph that characterizes the event association feature. Another option is to associate the entities of all events according to the causal relationships between the entities, obtaining a mesh structure graph that characterizes the event association feature.

The terminal behavior log may include one or more events, and each event includes at least one entity. When an event includes more than one entity, a corresponding causal relationship exists between those entities. In addition, the same entity may appear in several events, so the events can be connected through their shared entities and the causal relationships between entities. The resulting event association feature, which links the events together, serves as the feature to be detected.

In the prior art, events in the log content are generally associated in time order, and features are then extracted to train and run a neural network model. However, the way events occur in a terminal behavior log is relatively complex, and related events typically do not occur consecutively in time. Because of dependencies, waiting and similar effects in the execution process, related events are not laid out in time order. Characterizing the log as a time sequence therefore misses much of the feature information and degrades the training and detection effect of the neural network model.

In the present application, events are linked through the entities in their causal relationships, which avoids the problems caused by time-based linking. The causal relationship, namely the subject-predicate-object triple structure, is the most basic structure of an event. Events that share an entity can thus be linked across different time spans instead of being linked by time order alone, which improves the detection accuracy for the terminal behavior log.

For example, if entity A creates entity B and entity B then modifies entity C, the sequence formed by these two events is one of the event association features in the present application. If entity A creates entity B and entity C modifies entity B, the two events are not strictly causally related and share only entity B, but their association is also one of the event association features. Similarly, if entity A creates entity B and entity A modifies entity C, the two events are associated regardless of their time order, and this is also one of the event association features. Compared with time-sequence features, the features to be detected extracted by this technical scheme cover a larger range and contain more associations among events, which increases the amount of information in the features to be detected and improves the accuracy of detection and training.
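The three cases above can be illustrated with a minimal Python sketch. It assumes each event is already available as a (subject, operation, object) tuple; the function name `linked_event_pairs` and the sample events are hypothetical and do not come from the application. Two events are treated as associated whenever they share an entity, independent of when they occurred.

```python
from itertools import combinations

def linked_event_pairs(events):
    """Return index pairs of events that share at least one entity.

    Each event is a (subject, operation, object) tuple; time order is ignored.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(events), 2):
        # An event's entities are its subject (position 0) and object (position 2).
        if {a[0], a[2]} & {b[0], b[2]}:
            pairs.append((i, j))
    return pairs

# Hypothetical events mirroring the examples in the text, plus one unrelated event.
events = [
    ("A", "create", "B"),
    ("C", "modify", "B"),
    ("A", "modify", "C"),
    ("X", "read", "Y"),
]
pairs = linked_event_pairs(events)
```

Here the first three events are pairwise linked through entities A, B or C, while the fourth event shares no entity with the others and stays unlinked.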

Further, to improve the completeness and efficiency of feature extraction, reduce data redundancy, improve the efficiency and effect of model detection, and avoid degraded feature extraction caused by inconsistent formats, this step may include:

Step 1, carrying out format consistency processing on the terminal behavior log according to a preset format to obtain a preprocessing log;

Step 2, carrying out event association feature extraction processing on the preprocessing log according to the causal relationship among the entities in the preprocessing log to obtain the feature to be detected.

It can be seen that this alternative mainly describes how the feature to be detected is obtained. In this alternative, format consistency processing is first performed on the terminal behavior log according to a preset format to obtain a preprocessing log; event association feature extraction is then performed on the preprocessing log according to the causal relationships between the entities in the preprocessing log to obtain the feature to be detected. Format consistency processing converts the terminal behavior log into a log in the preset format, that is, logs in different formats are converted into one common format, which avoids the loss of processing efficiency and effect caused by differing log formats.

That is, the format of the terminal behavior log is unified according to the preset format so that all log data share one format, which speeds up log processing. For example, user login logs generated by different operating system versions may have different formats, fields, and meanings. Unifying the data format improves the speed and efficiency of data processing and, in turn, the efficiency of feature extraction.
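Format consistency processing can be sketched in Python as a field-renaming step. The application does not define the preset format, so the unified keys, the two source names and their field mappings below are all hypothetical, chosen only to show records from differently formatted sources converging on one schema.

```python
from typing import Any

# Hypothetical preset format: every record ends up with these four keys.
UNIFIED_KEYS = ("timestamp", "subject", "operation", "object")

# Hypothetical field mappings for two imaginary log sources.
FIELD_MAPS = {
    "source_a": {"time": "timestamp", "actor": "subject",
                 "action": "operation", "target": "object"},
    "source_b": {"ts": "timestamp", "proc": "subject",
                 "op": "operation", "obj": "object"},
}

def normalize(record: dict[str, Any], source: str) -> dict[str, Any]:
    """Rename source-specific fields to the preset format."""
    mapping = FIELD_MAPS[source]
    unified = {mapping[k]: v for k, v in record.items() if k in mapping}
    missing = [k for k in UNIFIED_KEYS if k not in unified]
    if missing:
        raise ValueError(f"record from {source} lacks fields: {missing}")
    # Operation names are lowercased so that "RUN" and "run" compare equal.
    unified["operation"] = str(unified["operation"]).lower()
    return unified

rec_a = normalize({"time": 1, "actor": "user:alice",
                   "action": "RUN", "target": "proc:cmd.exe"}, "source_a")
rec_b = normalize({"ts": 1, "proc": "user:alice",
                   "op": "run", "obj": "proc:cmd.exe"}, "source_b")
```

Both calls yield the same preprocessed record, so later feature extraction only has to handle one format.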

Further, in order to implement the event-related feature extraction process to extract features for detection from the terminal behavior log, the step may include:

Step 1, performing triplet analysis on a terminal behavior log to obtain an entity of each event in the terminal behavior log and an operation corresponding to a causal relationship between the entities;

Step 2, constructing the feature to be detected according to the entity, the operation and the causal relationship among the entities.

It can be seen that this alternative mainly describes how the feature to be detected is obtained. In this alternative, triple analysis is first performed on the terminal behavior log to obtain the entity of each event in the terminal behavior log and the operation corresponding to the causal relationship among the entities; the feature to be detected is then constructed according to the entity, the operation and the causal relationship among the entities.

An event in the terminal behavior log generally takes the form of entity A acting on (pointing to) entity B by performing an operation. The terminal behavior log can therefore be parsed into triples following the structure "entity A, operation, entity B". Concretely, events are extracted from the terminal behavior log, and each event is parsed according to the subject-predicate-object triple structure, with the subject and the object taken as entities and the predicate taken as the operation. The parsed subject-predicate-object triple is one representation of the causal relationship between the entities. The causal relationship can further be expressed as an edge: the entities in the event serve as the endpoints, and the operation serves as the edge between the endpoints.
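As a minimal sketch of the triple analysis, the following Python code turns event records into subject-predicate-object triples. It assumes each event is a dictionary with `subject`, `operation` and `object` keys; that record layout, the `Triple` type and the function names are assumptions made for illustration, not details given in the application.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # entity that performs the operation
    operation: str  # predicate; becomes the edge between the two entities
    obj: str        # entity that the operation acts on

def parse_event(event: dict) -> Triple:
    """Parse one event record into a subject-predicate-object triple."""
    return Triple(event["subject"], event["operation"], event["object"])

def parse_log(events: list) -> list:
    """Parse every event of a (hypothetical) terminal behavior log."""
    return [parse_event(e) for e in events]

log = [
    {"subject": "user:alice", "operation": "run", "object": "proc:cmd.exe"},
    {"subject": "proc:cmd.exe", "operation": "create", "object": "file:a.txt"},
]
triples = parse_log(log)
```

Each triple can be read directly as a directed edge: the subject and object are the endpoints and the operation is the edge from subject to object.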

Finally, the feature to be detected is constructed according to the entities, the operations and the causal relationships among the entities. That is, the events are associated with one another through the entities, the operations and the causal relationships to obtain a network structure, which is the feature to be detected.

Furthermore, in order to improve the efficiency of constructing the feature to be detected and to construct it accurately, the step of constructing the feature to be detected according to the entity, the operation and the causal relationship among the entities in the previous alternative may include:

Step 1, integrating entities corresponding to all events in all terminal behavior logs to obtain a plurality of entities to be associated corresponding to all events;

Step 2, performing network structure association on the operation and the plurality of entities to be associated according to the causal relationship among the plurality of entities to be associated, so as to obtain the feature to be detected.

It can be seen that this alternative mainly explains how to construct the feature to be detected. In this alternative, the entities corresponding to all events in all terminal behavior logs are first integrated to obtain the plurality of entities to be associated. A terminal behavior log may contain multiple events, or multiple events may be spread over several acquired terminal behavior logs, and the same entity may appear in different events. This step may therefore deduplicate the entities across the events so that all entities are integrated into one entity set, namely the plurality of entities to be associated. Network structure association is then performed on the operations and the entities to be associated according to the causal relationships among them, yielding the feature to be detected. That is, the entities to be associated are linked according to the causal relationships between the entities.

For example, suppose there are several events whose entities are, in order, entity A, entity B, entity C, entity D, entity A, entity B, entity D, entity A and entity C. After the repeated entities are removed, the entities are integrated into entity A, entity B, entity C and entity D, which are the entities to be associated. The entities to be associated are then associated directly according to the causal relationships of all events, rather than event by event, which improves the efficiency of constructing the feature to be detected. Moreover, since the complete set of entities is still entity A, entity B, entity C and entity D, associating and aggregating all the causal relationships loses no information.
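The integration and association steps can be sketched in Python as follows. The sketch assumes events are (subject, operation, object) tuples and represents the network structure as a plain adjacency mapping; both choices, the function names and the sample operations are illustrative assumptions rather than the representation used by the application.

```python
def integrate_entities(triples):
    """Deduplicate the entities of all events, keeping first-seen order."""
    seen = {}
    for subject, _operation, obj in triples:
        seen.setdefault(subject, None)
        seen.setdefault(obj, None)
    return list(seen)

def associate(triples):
    """Link the deduplicated entities using the causal relationships of all events."""
    adjacency = {entity: [] for entity in integrate_entities(triples)}
    for subject, operation, obj in triples:
        adjacency[subject].append((operation, obj))
    return adjacency

# Hypothetical events whose entities repeat across events.
triples = [
    ("A", "create", "B"),
    ("C", "modify", "D"),
    ("A", "modify", "B"),
    ("D", "read", "A"),
    ("A", "write", "C"),
]
entities = integrate_entities(triples)
network = associate(triples)
```

Every event still contributes its operation to the structure, so deduplicating the entities reduces redundancy without dropping any causal relationship.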

Alternatively, the step of constructing the feature to be detected according to the entity, the operation and the causal relationship among the entities may include:

Step 1, constructing a causal relationship edge based on entities, operations and causal relationships among the entities for the events in each terminal behavior log; wherein the endpoints of the causal relationship edge characterize the entities and the directed edge of the causal relationship edge characterizes the operation between the entities;

Step 2, carrying out directed graph aggregation on all causal relationship edges to obtain the feature to be detected.

It can be seen that this alternative mainly explains how to construct the feature to be detected. In this alternative, for the events in each terminal behavior log, a causal relationship edge is first constructed based on the entities, the operations and the causal relationships among the entities; the endpoints of a causal relationship edge characterize the entities, and the directed edge characterizes the operation between them. Directed graph aggregation is then performed on all causal relationship edges to obtain the feature to be detected. That is, a directed graph is constructed from the entities, operations and causal relationships of all events, and the resulting directed graph data is the feature to be detected.

In mathematics, a graph is a structure made up of a set of objects, some pairs of which are related in some sense. The objects correspond to mathematical abstractions called vertices (also called endpoints, nodes or points), and each pair of related vertices is called an edge (also called a link or line). A graph is usually drawn as a set of dots or circles for the vertices, joined by lines or curves for the edges. In computer science, likewise, a graph is a collection of vertices that are paired (connected) by a series of edges. Vertices are represented by circles and edges by the lines between the circles; the vertices connected by the edges form the graph data structure.

Therefore, the feature data extracted in this step is the graph data formed by the terminal entity as the end point and the inter-entity behavior as the edge. It can be seen that the terminal behavior log is processed in the form of endpoints and edges in order to distinguish the entities in the terminal behavior log.
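
As a rough sketch of the causal relationship edges and the directed graph aggregation described above, under the assumption that each event has already been reduced to a (head entity, operation, tail entity) tuple: the entity names, the `CausalEdge` type and the use of a counter to merge identical edges are illustrative choices, not details given by the application.

```python
from collections import Counter
from typing import NamedTuple


class CausalEdge(NamedTuple):
    head: str       # endpoint: the entity that initiates the operation
    operation: str  # label carried by the directed edge
    tail: str       # endpoint: the entity the operation acts on


def build_causal_edges(events):
    """Construct one causal relationship edge per event."""
    return [CausalEdge(*event) for event in events]


def aggregate_directed_graph(edges):
    """Aggregate all edges into one directed graph; identical edges are counted."""
    vertices = sorted({e.head for e in edges} | {e.tail for e in edges})
    return {"vertices": vertices, "edges": Counter(edges)}


# Hypothetical events for illustration only.
events = [
    ("winword.exe", "spawn", "powershell.exe"),
    ("powershell.exe", "write", "payload.dll"),
    ("winword.exe", "spawn", "powershell.exe"),
]
graph = aggregate_directed_graph(build_causal_edges(events))
print(graph["vertices"])
print(len(graph["edges"]))
```

Here the resulting `graph` plays the role of the feature to be detected: vertices are entities, and each directed edge records an operation between two entities.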

Further, in order to increase the density of information in the feature to be detected and increase the amount of information for each operation of each entity, attribute parameters may be added to the entity and the operation in this alternative. Therefore, in the above alternative, the step of performing triple analysis on the terminal behavior log to obtain the entity of each event in the terminal behavior log and the operation corresponding to the causal relationship between the entities may include:

Step 1, carrying out structural analysis on the terminal behavior log according to a triplet structure to obtain a triplet structure of each event in the terminal behavior log;

step 2, carrying out attribute analysis on the terminal behavior log according to an attribute information format to obtain attribute information of each event in the terminal behavior log;

And step 3, adding the attribute information of each event to the corresponding position of the corresponding triple structure to obtain the entity of each event in the terminal behavior log and the operation corresponding to the causal relationship between the entities.

It can be seen that, in this alternative, attribute information corresponding to the entity and the operation may be added as attribute parameters, so as to increase the information amount of the entity and the operation. The attribute information of the entity may be an entity name, an entity creation time, etc., and the attribute information of the operation may be an operation time, etc.
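
The three steps above might look like the following sketch. The record layout (keys `src`, `op`, `dst`, `src_attrs`, `op_attrs`, `dst_attrs`) is a hypothetical log format chosen for the example; the application does not specify a concrete format.

```python
def parse_structure(record):
    """Step 1: structural analysis according to the triplet structure."""
    return {
        "first_entity": {"id": record["src"]},
        "operation": {"type": record["op"]},
        "second_entity": {"id": record["dst"]},
    }


def parse_attributes(record):
    """Step 2: attribute analysis according to the (assumed) attribute format."""
    return {
        "first_entity": record.get("src_attrs", {}),
        "operation": record.get("op_attrs", {}),
        "second_entity": record.get("dst_attrs", {}),
    }


def attach_attributes(triplet, attributes):
    """Step 3: add each attribute set to the matching position of the triplet."""
    return {
        slot: {**value, **attributes.get(slot, {})}
        for slot, value in triplet.items()
    }


# Hypothetical log record for illustration only.
record = {
    "src": "proc:1234", "op": "write", "dst": "file:/tmp/a.bin",
    "src_attrs": {"name": "cmd.exe", "created": "2021-05-20T08:00:00"},
    "op_attrs": {"time": "2021-05-20T08:00:05"},
    "dst_attrs": {"name": "a.bin", "created": "2021-05-20T08:00:05"},
}
event = attach_attributes(parse_structure(record), parse_attributes(record))
print(event["operation"])
```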

Further, in order to implement graph feature extraction processing on the terminal behavior log, that is, to extract features in the form of graphs from the terminal behavior log, in this embodiment the step of performing event association feature extraction processing on the terminal behavior log according to causal relationships between entities in the terminal behavior log to obtain features to be detected may include:

step 1, carrying out edge extraction processing on each log content in a terminal behavior log to obtain a plurality of endpoints and a plurality of edges corresponding to the endpoints;

Step 2, aggregating the edges with the same endpoints to obtain the feature to be detected.

It can be seen that this alternative mainly explains how the graph feature extraction process is performed. In this alternative, edge extraction processing is first performed on each log content in the terminal behavior log to obtain a plurality of endpoints and a plurality of edges corresponding to the endpoints. That is, each piece of log data is first converted into the form of endpoints and edges. Here, a piece of log content generally records that entity A performs an operation directed at entity B. Thus, log content may be converted into multiple endpoints and edges according to the triplet structure of entity A, operation, entity B. The edges with the same endpoints are then aggregated to obtain graph data, which is taken as the feature to be detected.

Each piece of log content is the log content of each event.
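
A minimal sketch of the edge extraction and aggregation just described, assuming each piece of log content has already been reduced to a (first entity, operation, second entity) tuple. The undirected adjacency mapping used here is one possible representation, chosen for the example rather than taken from the application.

```python
def extract_edge(log_entry):
    """Convert one piece of log content into two endpoints and an edge label."""
    first_entity, operation, second_entity = log_entry
    return first_entity, second_entity, operation


def aggregate_edges(log_entries):
    """Join edges at shared endpoints to form an undirected graph."""
    graph = {}
    for entry in log_entries:
        a, b, operation = extract_edge(entry)
        graph.setdefault(a, set()).add((operation, b))
        graph.setdefault(b, set()).add((operation, a))
    return graph


# Hypothetical log contents for illustration only.
logs = [("A", "read", "B"), ("B", "write", "C")]
graph = aggregate_edges(logs)
print(sorted(graph["B"]))  # [('read', 'A'), ('write', 'C')]
```

Endpoint B appears in both log entries, so the two edges meet at B and the two otherwise discrete events become part of one graph.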

Furthermore, in order to maintain the integrity of the information, corresponding attributes can be added into each endpoint and each edge, so that the integrity of the causal feature representation is improved, and the problem of information loss is avoided. In the above alternative, the step of performing edge extraction processing on each log content in the terminal behavior log to obtain multiple endpoints and multiple edges corresponding to the multiple endpoints may include:

step 1, performing triple analysis on log content to obtain a first entity, an operation and a second entity corresponding to the log content;

Step 2, taking the first entity as a first endpoint, taking the second entity as a second endpoint, and taking the operation as an attribute of an edge between the first endpoint and the second endpoint;

Step 3, combining all the first endpoints, all the second endpoints and the attributes of the corresponding edges to obtain a plurality of endpoints and a plurality of corresponding edges.

Referring to fig. 2, fig. 2 is a schematic diagram illustrating a first feature of a detection method according to an embodiment of the application.

It can be seen that this alternative is mainly illustrative of how the extraction of the endpoints and edges can be performed. In the alternative scheme, the log content is firstly subjected to triplet analysis to obtain a first entity, an operation and a second entity corresponding to the log content; then, taking the first entity as a first endpoint, taking the second entity as a second endpoint, and taking the operation as an attribute of an edge between the first endpoint and the second endpoint; and finally, combining all the first endpoints, all the second endpoints and the attributes of the corresponding edges to obtain a plurality of endpoints and a plurality of corresponding edges. In addition, after determining the first endpoint, the second endpoint and the connection line (edge) between the first endpoint and the second endpoint in the present alternative, the attribute of the first entity may be added to the first endpoint, the attribute of the second entity may be added to the second endpoint, the attribute of the operation may be added to the attribute of the edge, and finally the corresponding endpoint and edge may be obtained. Wherein, the attribute of the entity can be entity name, entity creation time, etc., and the attribute of the operation can be operation time, etc.

Further, in order to aggregate the edges and the end points in the previous alternative to form a graph, and implement the graph feature extraction processing, the step of "aggregating the edges with the same end points to obtain the feature to be detected" in the previous alternative may include:

and connecting the same endpoints in the multiple edges to obtain the feature to be detected.

Referring to fig. 3, fig. 3 is a schematic diagram illustrating a second feature of the detection method according to the embodiment of the application.

In this alternative solution, the same end points of the multiple edges may be directly connected to form a corresponding graph, so as to obtain the feature to be detected.

Further, in order to implement the directed graph feature extraction processing on the terminal behavior log, features in the form of directed graphs are extracted from the terminal behavior log. In this embodiment, the step of performing event association feature extraction processing on the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain the feature to be detected may include:

Step 1, carrying out directed edge extraction processing on each log content in a terminal behavior log to obtain a plurality of endpoints and a plurality of corresponding directed edges;

Step 2, aggregating the directed edges with the same endpoints to obtain the feature to be detected.

It can be seen that the present alternative mainly describes how to perform the directed graph feature extraction. In this alternative, directed edge extraction processing is first performed on each log content in the terminal behavior log to obtain a plurality of endpoints and a plurality of corresponding directed edges. That is, each piece of log data is first converted into the form of endpoints and directed edges. Here, a piece of log content generally records that entity A performs an operation directed at entity B. Thus, the log content can be converted into a directed edge according to the triplet structure of entity A, operation, entity B, and converting all the logs yields a plurality of directed edges. The directed edges with the same endpoints among the plurality of directed edges are then aggregated to obtain the feature to be detected.

Each piece of log content is the log content of each event.

Furthermore, in order to maintain the integrity of the information, corresponding attributes can be added into each endpoint and each edge, so that the integrity of the causal feature representation is improved, and the problem of information loss is avoided. Therefore, in the above alternative, the step of performing the directed edge extraction processing on each log content in the terminal behavior log to obtain multiple endpoints and multiple corresponding directed edges may include:

Step 1, performing triplet analysis on the log content to obtain a first entity, an operation and a second entity corresponding to the log content;

Step 2, taking the first entity as a head end point, taking the second entity as a tail end point, and taking the operation as an attribute of the edge between the head end point and the tail end point;

Step 3, combining all the head end points, all the tail end points and the attributes of the corresponding edges to obtain a plurality of end points and a plurality of corresponding directed edges.

Referring to fig. 4, fig. 4 is a schematic diagram illustrating a third feature of the detection method according to the embodiment of the present application.

It can be seen that this alternative is mainly to explain how the extraction of the directed edges is performed. In the alternative scheme, the log content is firstly subjected to triplet analysis to obtain a first entity, an operation and a second entity corresponding to the log content; then, taking the first entity as a head end point, taking the second entity as a tail end point, and taking the operation as an attribute of the edge between the head end point and the tail end point; and finally, combining all the head end points, all the tail end points and the attributes of the corresponding edges to obtain a plurality of end points and a plurality of corresponding directed edges.

In addition, in this alternative, after the head end point, the tail end point and the connection line (edge) between them are determined, the attribute of the first entity may be added to the head end point, the attribute of the second entity may be added to the tail end point, and the attribute of the operation may be added to the attributes of the edge, finally yielding the corresponding end points and directed edge. Here, the attributes of an entity may be the entity name, the entity creation time and the like, and the attributes of an operation may be the operation time and the like.
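
The directed edge with attributes described above could be represented as in the sketch below. The `Endpoint` and `DirectedEdge` types, the input shape of `to_directed_edge` and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    entity_id: str
    attributes: dict = field(default_factory=dict)  # e.g. entity name, creation time


@dataclass
class DirectedEdge:
    head: Endpoint  # first entity
    tail: Endpoint  # second entity
    attributes: dict = field(default_factory=dict)  # operation and its attributes


def to_directed_edge(first, operation, second):
    """Each argument is assumed to be an (identifier, attribute dict) pair."""
    head = Endpoint(first[0], dict(first[1]))
    tail = Endpoint(second[0], dict(second[1]))
    op_type, op_attrs = operation
    return DirectedEdge(head, tail, {"operation": op_type, **op_attrs})


# Hypothetical parsed log content for illustration only.
edge = to_directed_edge(
    ("proc:1", {"name": "cmd.exe"}),
    ("write", {"time": "2021-05-20T08:00:05"}),
    ("file:9", {"name": "a.bin"}),
)
print(edge.attributes)
```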

Further, in order to aggregate the multiple directed edges more quickly, the step of "aggregating the directed edges with the same end points to obtain the feature to be detected" in the previous alternative may include:

and connecting the same endpoints in the plurality of directed edges to obtain the feature to be detected.

Referring to fig. 5, fig. 5 is a schematic diagram illustrating a fourth feature of the detection method according to the embodiment of the present application.

Therefore, in the alternative scheme, the same end points of the plurality of directed edges can be directly connected to form a corresponding directed graph, so that the feature to be detected is obtained.
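
To illustrate why connecting the same end points yields an association that does not depend on time order, the sketch below joins directed edges at shared end points and then follows them. The edge tuples and helper names are hypothetical, and the breadth-first traversal is only a way to inspect the resulting directed graph, not a step of the method itself.

```python
from collections import deque


def join_directed_edges(directed_edges):
    """Connect identical end points so that the edges form one directed graph."""
    graph = {}
    for head, operation, tail in directed_edges:
        graph.setdefault(head, []).append((operation, tail))
        graph.setdefault(tail, [])
    return graph


def reachable_from(graph, start):
    """Entities causally reachable from `start` by following directed edges."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for _operation, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# The edges are deliberately listed out of chronological order.
edges = [("B", "write", "C"), ("A", "spawn", "B")]
graph = join_directed_edges(edges)
print(sorted(reachable_from(graph, "A")))  # ['A', 'B', 'C']
```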

And S103, detecting the feature to be detected by adopting a pre-trained behavior detection model to obtain a detection result.

On the basis of S102, this step aims at detecting the feature to be detected by adopting a pre-trained behavior detection model to obtain a detection result. The behavior detection model is obtained by training according to a terminal behavior log training set.

The step of training the neural network according to the terminal behavior log training set may refer to the content of the neural network training in the next embodiment, which is not described herein.

Further, to improve the accuracy and efficiency of the detection, the behavior detection model may include a graph neural network model. The graph neural network is a method for processing graph information based on a neural network machine learning algorithm.
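
For readers unfamiliar with graph neural networks, the sketch below shows one round of neighborhood aggregation, the basic operation that such models repeat with learned weights. It is a generic illustration with no learned parameters; the application does not specify a particular architecture, and the feature vectors and function name are assumptions.

```python
def message_passing_round(features, edges):
    """Average each node's feature vector with those of its in-neighbors."""
    incoming = {node: [vector] for node, vector in features.items()}
    for head, tail in edges:
        incoming[tail].append(features[head])
    return {
        node: [sum(column) / len(vectors) for column in zip(*vectors)]
        for node, vectors in incoming.items()
    }


# Hypothetical node features and directed edges (head, tail).
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.0, 0.0]}
edges = [("A", "B"), ("B", "C")]
updated = message_passing_round(features, edges)
print(updated["B"])  # [0.5, 0.5]
```

After one round, each entity's representation reflects the entities that acted on it, which is how graph-structured features can carry causal context into the detection model.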

In summary, in this embodiment, the causal relationship between the entities in the terminal behavior log is used to perform event association feature extraction processing on the terminal behavior log to obtain the feature to be detected, so that a non-time-series association relationship is formed between discrete events, and finally, the behavior detection model is used to perform terminal behavior detection, instead of performing matching detection on a single discrete behavior feature by using a manual rule, thereby improving the detection effect and accuracy of the target entity.

To avoid analyzing logs from a time-series perspective, discrete log contents are analyzed and detected from the perspective of causal relationships, which improves the accuracy of log analysis and detection. The following describes a behavior detection model generation method provided by the present application through an embodiment.

Referring to fig. 6, fig. 6 is a flowchart of a behavior detection model generating method according to an embodiment of the present application.

In this embodiment, the method may include:

S201, acquiring a terminal behavior log training set;

The step aims at acquiring a terminal behavior log training set. The terminal behavior log training set is training set data obtained from different data sources and is mainly used for training a neural network.

The data sources may be different terminals. A pre-stored training set may be retrieved from a database, or real-time data may be collected from the terminals to serve as the training set. It can be seen that the manner of acquiring the log training set in this embodiment is not unique, and is not specifically limited herein. The corresponding acquisition mode can be selected according to the actual application situation.

S202, carrying out event association feature extraction processing on a terminal behavior log training set according to causal relations among entities in the terminal behavior log training set to obtain feature data to be trained;

On the basis of S201, this step aims at carrying out event association feature extraction processing on the terminal behavior log training set according to the causal relationship among the entities in the terminal behavior log training set to obtain feature data to be trained. Specifically, the processing is the same as that applied to the terminal behavior log in the previous embodiment to obtain the feature to be detected. For example, the causal relationship of each event in the terminal behavior log training set may be analyzed, and all the causal relationships aggregated into a directed graph that characterizes the event association features. Alternatively, the entities corresponding to all the events may be associated according to the causal relationship between the entities to obtain a mesh structure diagram that characterizes the event association features.

Further, the content of this step is substantially the same as that of the previous embodiment. The difference is that the previous embodiment processes the terminal behavior log, whereas this embodiment processes the terminal behavior log training set. That is, the processing objects are named differently, but both are behavior logs. Therefore, for the description of this step, reference may be made to the previous embodiment, and details are not repeated here.

Further, the step may include:

Step 1, carrying out format consistency processing on a terminal behavior log training set according to a preset format to obtain a preprocessing log;

and step 2, carrying out event association feature extraction processing on the preprocessing log according to the causal relationship among the entities in the preprocessing log to obtain feature data to be trained.
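
The format consistency processing in step 1 can be pictured as mapping records from different sources onto one preset format. In the sketch below, the source names `agent_a` and `agent_b`, their field names and the preset fields are all hypothetical; the application does not define the preset format.

```python
PRESET_FIELDS = ("subject", "operation", "object")

# Hypothetical per-source mappings from raw field names to the preset format.
FIELD_MAPS = {
    "agent_a": {"proc": "subject", "action": "operation", "target": "object"},
    "agent_b": {"src": "subject", "op": "operation", "dst": "object"},
}


def normalize(record, source):
    """Rewrite one raw record into the preset format, dropping unmapped fields."""
    mapping = FIELD_MAPS[source]
    unified = {
        mapping[key]: str(value).strip()
        for key, value in record.items()
        if key in mapping
    }
    missing = [name for name in PRESET_FIELDS if name not in unified]
    if missing:
        raise ValueError(f"{source} record lacks fields: {missing}")
    unified["operation"] = unified["operation"].lower()
    return unified


raw_a = {"proc": " cmd.exe", "action": "WRITE", "target": "a.txt", "pid": 7}
raw_b = {"src": "cmd.exe", "op": "write", "dst": "a.txt"}
print(normalize(raw_a, "agent_a") == normalize(raw_b, "agent_b"))  # True
```

Once every record shares the same fields, the event association feature extraction in step 2 can treat logs from all sources uniformly.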

Further, the step may include:

step 1, performing triplet analysis on a terminal behavior log training set to obtain an entity of each event in the terminal behavior log training set and an operation corresponding to a causal relationship between the entities;

Step 2, constructing feature data to be trained according to the entities, the operations and the causal relationship among the entities.

Further, the step of "constructing feature data to be trained according to the entities, operations, and causal relationships among the entities" in the previous alternative may include:

step 1, integrating entities corresponding to all events in all terminal behavior log training sets to obtain a plurality of entities to be associated corresponding to all events;

And step 2, performing network structure association on the operation and the plurality of entities to be associated according to the causal relationship among the plurality of entities to be associated, so as to obtain the feature data to be trained.

Step 1, constructing a causal relationship edge based on entities, operations and causal relationships among the entities aiming at events in each terminal behavior log training set; wherein endpoints in the causal edge characterize the entities and directed edges in the causal edge characterize the operations between the entities;

and step 2, carrying out directed graph aggregation on all causal relationship edges to obtain feature data to be trained.

Further, the step of performing triple analysis on the terminal behavior log training set to obtain the entity of each event in the terminal behavior log training set and the operation corresponding to the causal relationship between the entities in the previous alternative may include:

As above, the details of all the above alternatives are substantially the same as those of the above embodiment, and will not be described herein.

S203, training a detection model according to the feature data to be trained to obtain a behavior detection model.

On the basis of S202, this step aims at performing detection model training according to the feature data to be trained to obtain a behavior detection model.

The training of the detection model in this step may refer to any training method provided in the prior art, and is not specifically limited herein.

Furthermore, in order to improve the accuracy and efficiency of training by adopting a more suitable neural network model for training and detection, the behavior detection model may include a graph neural network model. A graph neural network is a neural-network-based machine learning method for processing graph information.

In summary, according to the embodiment, the feature data to be trained is obtained by extracting the event association features of the terminal behavior log training set, so that association relations of non-time sequences are formed among discrete events, and finally, the behavior detection model is obtained by training the association relations among the terminal behaviors instead of matching and detecting single discrete behavior features in a manual rule mode, thereby improving the detection effect and accuracy of the target entity.

The following describes a detection method provided by the present application by a specific embodiment.

Referring to fig. 7, fig. 7 is a flowchart of another detection method according to an embodiment of the application.

In this embodiment, the method may include:

S301, acquiring a terminal attack behavior log training set;

S302, carrying out graph feature extraction processing on a terminal attack behavior log training set to obtain feature data to be trained;

S303, training the graph neural network according to the feature data to be trained to obtain a graph neural network model;

S304, acquiring a terminal behavior log;

S305, performing graph feature extraction processing on the terminal behavior log to obtain features to be detected;

S306, detecting the feature to be detected by adopting a graph neural network model to obtain malicious attack behaviors.

It can be seen that in this embodiment, the model training and detection are mainly performed on malicious attack behaviors in log data, and the graph data serving as features to be detected are obtained by performing graph feature extraction on the terminal behavior log, so that a non-time-series association relationship is formed between discrete events, and finally, the terminal behavior detection is performed by using a behavior detection model instead of performing the matching detection on single discrete behavior features by using a manual rule manner, thereby improving the detection effect and accuracy of malicious entities.
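
The flow of S301 to S306 can be walked through with the sketch below. Note the loud assumption: the "model" here is a deliberately trivial stand-in that memorizes pairs of operations joined through a shared entity, used only so that the example runs without dependencies. It is not a graph neural network and not the model of the application, and all log contents are hypothetical.

```python
def operation_chains(log):
    """Stand-in graph feature: pairs of operations joined through a shared entity."""
    return {
        (op1, op2)
        for _head1, op1, tail1 in log
        for head2, op2, _tail2 in log
        if tail1 == head2
    }


def train(attack_logs):
    """S301 to S303 (stand-in): memorize chains seen in attack behavior logs."""
    model = set()
    for log in attack_logs:
        model |= operation_chains(log)
    return model


def detect(model, log):
    """S304 to S306 (stand-in): flag a log containing any memorized chain."""
    return bool(model & operation_chains(log))


attack_training_set = [
    [("doc.exe", "spawn", "shell.exe"), ("shell.exe", "download", "x.bin")],
]
model = train(attack_training_set)

# Events are listed out of time order; the chain is still found through entity "b".
suspicious_log = [("b", "download", "c"), ("a", "spawn", "b")]
unrelated_log = [("a", "spawn", "b"), ("c", "download", "d")]
print(detect(model, suspicious_log), detect(model, unrelated_log))  # True False
```

The second log contains the same two operations, but they share no entity, so no causal chain is formed; matching on single discrete behaviors would not make this distinction.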

The detection device provided by the embodiment of the application is described below, and the detection device described below and the detection method described above can be referred to correspondingly.

Referring to fig. 8, fig. 8 is a schematic structural diagram of a detection device according to an embodiment of the application.

In this embodiment, the apparatus may include:

a behavior log obtaining module 110, configured to obtain a terminal behavior log;

The event association feature extraction module 120 is configured to perform event association feature extraction processing on the terminal behavior log according to causal relationships between entities in the terminal behavior log, so as to obtain features to be detected;

the feature detection module 130 is configured to detect a feature to be detected by using a pre-trained behavior detection model, so as to obtain a detection result.
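
One way to picture how the three modules cooperate is the sketch below, in which each module is passed in as a callable. The class name, attribute names and the placeholder callables are assumptions for illustration; only the module numbering follows the text.

```python
class DetectionDevice:
    """Composes the three modules in the order in which they are used."""

    def __init__(self, obtain_log, extract_features, detect):
        self.behavior_log_obtaining_module = obtain_log          # module 110
        self.event_association_feature_module = extract_features  # module 120
        self.feature_detection_module = detect                   # module 130

    def run(self):
        log = self.behavior_log_obtaining_module()
        feature = self.event_association_feature_module(log)
        return self.feature_detection_module(feature)


# Placeholder callables; a real device would plug in log collection,
# graph feature extraction and a pre-trained behavior detection model.
device = DetectionDevice(
    obtain_log=lambda: [("A", "write", "B"), ("A", "write", "B")],
    extract_features=lambda log: set(log),
    detect=lambda feature: {"edge_count": len(feature)},
)
print(device.run())  # {'edge_count': 1}
```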

Optionally, the behavior detection model comprises a graph neural network model.

Optionally, the event association feature extraction module 120 may include:

the detection preprocessing unit is used for carrying out format consistency processing on the terminal behavior log according to a preset format to obtain a preprocessing log;

The detection extraction unit is used for carrying out event association feature extraction processing on the preprocessing log according to the causal relationship among the entities in the preprocessing log to obtain the feature to be detected.

Optionally, the event association feature extraction module 120 may include:

The detection event analysis unit is used for carrying out triplet analysis on the terminal behavior log to obtain the entity of each event in the terminal behavior log and the operation corresponding to the causal relationship between the entities;

The detection feature construction unit is used for constructing the feature to be detected according to the entities, the operations and the causal relationship among the entities.

Optionally, the detection feature construction unit may be configured to integrate entities corresponding to all events in all terminal behavior logs to obtain a plurality of entities to be associated corresponding to all events; and carrying out network structure association on the operation and the plurality of entities to be associated according to the causal relationship among the plurality of entities to be associated, so as to obtain the characteristics to be detected.

Optionally, the detection feature construction unit may be configured to construct, for each event in the terminal behavior log, a causal relationship edge based on an entity, an operation, and a causal relationship among the entities; wherein endpoints in the causal edge characterize the entities and directed edges in the causal edge characterize the operations between the entities; and carrying out directed graph aggregation on all causal relation edges to obtain the feature to be detected.

Optionally, the detection event analysis unit may be configured to perform structural analysis on the terminal behavior log according to a triplet structure, to obtain a triplet structure of each event in the terminal behavior log; carrying out attribute analysis on the terminal behavior log according to an attribute information format to obtain attribute information of each event in the terminal behavior log; and adding the attribute information of each event to the corresponding position of the corresponding triplet structure to obtain the entity of each event in the terminal behavior log and the operation corresponding to the causal relationship between the entities.

The behavior detection model generating device provided by the embodiment of the present application is described below, and the behavior detection model generating device described below and the behavior detection model generating method described above can be referred to correspondingly.

Referring to fig. 9, fig. 9 is a schematic structural diagram of a behavior detection model generating device according to an embodiment of the application.

In this embodiment, the apparatus may include:

a training set obtaining module 210, configured to obtain a terminal behavior log training set;

The training feature extraction module 220 is configured to perform event association feature extraction processing on the terminal behavior log training set according to causal relationships between entities in the terminal behavior log training set, so as to obtain feature data to be trained;

The model training module 230 is configured to perform detection model training according to the feature data to be trained to obtain a behavior detection model.

Optionally, the training feature extraction module 220 may include:

the training preprocessing unit is used for carrying out format consistency processing on the terminal behavior log training set according to a preset format to obtain a preprocessing log;

The training extraction unit is used for carrying out event association feature extraction processing on the preprocessing log according to the causal relationship among the entities in the preprocessing log to obtain feature data to be trained.

Optionally, the training feature extraction module 220 may include:

the training event analysis unit is used for carrying out triple analysis on the terminal behavior log training set to obtain the entity of each event in the terminal behavior log training set and the operation corresponding to the causal relationship among the entities;

The training feature construction unit is used for constructing feature data to be trained according to the entities, the operations and the causal relationship among the entities.

Optionally, the training feature construction unit may be configured to integrate entities corresponding to all events in the training set of all terminal behavior logs to obtain a plurality of entities to be associated corresponding to all events; and carrying out network structure association on the operation and the plurality of entities to be associated according to the causal relationship among the plurality of entities to be associated, so as to obtain the feature data to be trained.

Optionally, the training feature construction unit may be configured to construct, for each event in the training set of the terminal behavior log, a causal relationship edge based on the entity, the operation, and a causal relationship among the entities; wherein endpoints in the causal edge characterize the entities and directed edges in the causal edge characterize the operations between the entities; and carrying out directed graph aggregation on all causal relation edges to obtain feature data to be trained.

Optionally, the training event analysis unit may be configured to perform triplet analysis on the terminal behavior log training set according to an attribute information format to obtain an entity of each event in the terminal behavior log training set, attribute information corresponding to the entity, an operation corresponding to a causal relationship between the entities, and attribute information corresponding to the operation; taking the attribute information corresponding to the entity as attribute parameters of the entity, and attaching the attribute information to the entity; and taking the attribute information corresponding to the operation as the attribute parameter of the operation, and attaching the attribute information to the operation.

The embodiment of the application also provides a server, which comprises:

a memory for storing a computer program;

A processor for implementing the steps of the behavior detection model generation method as described in the above embodiments and/or the steps of the detection method as described in the above embodiments when executing the computer program.

Embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the behavior detection model generation method as described in the above embodiments and/or the steps of the detection method as described in the above embodiments.

In the description, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the other embodiments, so that for the same or similar parts the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and for the relevant points reference may be made to the description of the method.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The detection method, the behavior detection model generation method, the detection device, the behavior detection model generation device, the server and the computer readable storage medium provided by the application are described in detail above. The principles and embodiments of the present application have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present application and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the application can be made without departing from the principles of the application and these modifications and adaptations are intended to be within the scope of the application as defined in the following claims.

Claims

1. A detection method, characterized by comprising:

acquiring a terminal behavior log;

performing event association feature extraction processing on the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain a feature to be detected, so that a non-time-sequence association relationship is formed among discrete events;

detecting the feature to be detected by adopting a pre-trained behavior detection model to obtain a detection result;

wherein the step of performing event association feature extraction processing on the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain the feature to be detected comprises:

constructing the feature to be detected according to the entity, the operation and the causal relationship among the entities;

wherein the step of constructing the feature to be detected according to the entity, the operation and the causal relationship among the entities comprises:

2. The detection method according to claim 1, wherein the behavior detection model comprises a graph neural network model.

3. The detection method according to claim 1, wherein the step of performing event association feature extraction processing on the terminal behavior log according to the causal relationship between the entities in the terminal behavior log to obtain the feature to be detected comprises:

4. The method of claim 1, wherein the step of constructing the feature to be detected based on the entities, the operations, and causal relationships between the entities comprises:

5. The method according to any one of claims 1 to 4, wherein the step of performing triplet analysis on the terminal behavior log to obtain an entity of each event in the terminal behavior log and an operation corresponding to a causal relationship between the entities includes:

6. A behavior detection model generation method, characterized by comprising:

acquiring a terminal behavior log training set;

performing event association feature extraction processing on the terminal behavior log training set according to the causal relationship among entities in the terminal behavior log training set to obtain feature data to be trained, so that a non-time-sequence association relationship is formed among discrete events;

training a detection model according to the feature data to be trained to obtain a behavior detection model;

wherein the step of performing event association feature extraction processing on the terminal behavior log training set according to the causal relationship among entities in the terminal behavior log training set to obtain the feature data to be trained comprises:

constructing the feature data to be trained according to the entity, the operation and the causal relationship among the entities;

wherein the step of constructing the feature data to be trained according to the entity, the operation and the causal relationship among the entities comprises:

7. The behavior detection model generation method of claim 6, wherein the behavior detection model comprises a graph neural network model.

8. The behavior detection model generation method according to claim 6, wherein the step of performing event association feature extraction processing on the terminal behavior log training set according to the causal relationship among entities in the terminal behavior log training set to obtain the feature data to be trained comprises:

9. The behavior detection model generation method according to claim 7, wherein the step of constructing the feature data to be trained from the entities, the operations, and causal relationships among the entities, comprises:

10. The behavior detection model generation method according to any one of claims 7 to 9, wherein the step of performing triplet analysis on the terminal behavior log training set to obtain an entity of each event in the terminal behavior log training set and an operation corresponding to a causal relationship between the entities includes:

11. A detection apparatus, characterized by comprising:

an event association feature extraction module, configured to perform event association feature extraction processing on the terminal behavior log according to the causal relationship among the entities in the terminal behavior log to obtain the feature to be detected, so that a non-time-sequence association relationship is formed among discrete events;

a feature detection module, configured to detect the feature to be detected by adopting a pre-trained behavior detection model to obtain a detection result;

12. A behavior detection model generation device, characterized by comprising:

a graph feature extraction module, configured to perform event association feature extraction processing on the terminal behavior log training set according to the causal relationship among the entities in the terminal behavior log training set to obtain feature data to be trained, so that a non-time-sequence association relationship is formed among discrete events;

a model training module, configured to train a detection model according to the feature data to be trained to obtain a behavior detection model;

13. A server, comprising:

a memory for storing a computer program;

a processor for implementing, when executing the computer program, the steps of the detection method according to any one of claims 1 to 5 and/or the steps of the behavior detection model generation method according to any one of claims 6 to 10.

14. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the detection method according to any one of claims 1 to 5 and/or the steps of the behavior detection model generation method according to any one of claims 6 to 10.