CN114186232A - Network attack team identification method and device, electronic equipment and storage medium
- Publication number
- CN114186232A (application CN202111519490.2A)
- Authority
- CN
- China
- Prior art keywords
- data set
- cluster
- clustering
- network attack
- team
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/566—Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Abstract
The invention discloses a network attack team identification method and device, an electronic device, and a storage medium, which address the technical problem that network attacks cannot be effectively and accurately identified from massive volumes of alarm information. The method comprises: extracting network attack log data from a preset database; standardizing the network attack log data to obtain a standardized log data set; clustering the data objects in the standardized log data set to obtain a plurality of network attack teams; generating a team profile for each network attack team; and, when network anomaly information is received, matching the network anomaly information against the team profiles to determine the target network attack team corresponding to the network anomaly information.
Description
Technical Field
The invention relates to the technical field of network attack identification, and in particular to a network attack team identification method and device, an electronic device, and a storage medium.
Background
A network attack refers to any type of offensive action directed at a computer information system, infrastructure, computer network, or personal computer device. For computers and computer networks, destroying, exposing, modifying, or disabling software or services, or stealing or accessing data on any computer without authorization, is considered an attack.
Intrusion detection is a reasonable supplement to a firewall. It helps a system deal with network attacks, extends the security management capabilities of system administrators (including security audit, monitoring, attack identification and response), and improves the integrity of the information security infrastructure. It collects information from several key points in the computer network system and analyzes it for violations of security policy and evidence of attack. Intrusion detection is regarded as the second security gate behind the firewall: it can monitor the network without affecting network performance, providing real-time protection against internal attacks, external attacks and misoperation.
Current intrusion detection systems mainly monitor network traffic, compare traffic characteristics against a rule base, and raise alarms on abnormal traffic. Because scanning by security vendors and hackers occurs constantly on the internet, an intrusion detection system generates a large amount of alarm information. Security personnel are worn down by processing large numbers of meaningless alarms, and real attack alarms may be buried among them. As a result, network attacks cannot be defended against effectively and accurately.
Disclosure of Invention
The invention provides a network attack team identification method and device, an electronic device, and a storage medium, which address the technical problem that network attacks cannot be effectively and accurately identified from massive volumes of alarm information.
The invention provides a network attack team identification method, which comprises the following steps:
extracting network attack log data from a preset database;
standardizing the network attack log data to obtain a standardized log data set;
clustering data objects in the standardized log data set to obtain a plurality of network attack teams;
generating a team profile of each of the network attack teams;
when network anomaly information is received, matching the network anomaly information against the team profiles, and determining a target network attack team corresponding to the network anomaly information.
Optionally, the step of clustering the data objects in the standardized log data set to obtain a plurality of network attack teams includes:
calculating a first Euclidean distance for each of the data objects in the standardized log data set and an average distance over all of the data objects;
extracting first samples from the standardized log data set according to the first Euclidean distance and the average distance, and generating a first sample data set;
counting the total number of first samples in the first sample data set;
calculating a first clustering number according to the total number;
clustering the first samples in the first sample data set based on the first clustering number to obtain a second sample data set; the second sample data set includes a plurality of first clusters whose number corresponds to the first clustering number; each first cluster corresponds to one second sample;
calculating a second clustering number according to the first clustering number;
extracting the two second samples with the minimum second Euclidean distance from the second sample data set, generating a second cluster, and adding the second cluster to a preset third sample data set;
judging whether the number of second clusters in the third sample data set is equal to the second clustering number;
if so, calculating the arithmetic mean of the first samples in each second cluster;
judging whether the differences between the arithmetic means of any two second clusters are all larger than a preset threshold;
and if so, determining each second cluster as a network attack team.
Optionally, the method further comprises:
if the number of second clusters in the third sample data set is not equal to the second clustering number, returning to the step of extracting the two second samples with the minimum second Euclidean distance from the second sample data set, generating a second cluster, and adding the second cluster to a preset third sample data set.
Optionally, the method further comprises:
if the difference between the arithmetic means of two second clusters is not larger than the preset threshold, setting the second clustering number as the first clustering number, setting the third sample data set as the second sample data set, and returning to the step of calculating the second clustering number according to the first clustering number.
Optionally, the step of extracting first samples from the standardized log data set according to the first Euclidean distance and the average distance to generate a first sample data set includes:
extracting data objects whose first Euclidean distance is not greater than the average distance from the standardized log data set as first samples, and generating the first sample data set.
The invention also provides a network attack team identification device, which comprises:
the extraction module is used for extracting the network attack log data from a preset database;
the standardized processing module is used for carrying out standardized processing on the network attack log data to obtain a standardized log data set;
the clustering module is used for clustering data objects in the standardized log data set to obtain a plurality of network attack teams;
the team profile generation module is used for generating a team profile of each network attack team;
and the target network attack team determining module is used for matching the network anomaly information against the team profiles when the network anomaly information is received, and determining a target network attack team corresponding to the network anomaly information.
Optionally, the clustering module includes:
a first Euclidean distance and average distance calculation submodule, configured to calculate a first Euclidean distance for each of the data objects in the standardized log data set and an average distance over all of the data objects;
a first sample data set generation submodule, configured to extract first samples from the standardized log data set according to the first Euclidean distance and the average distance, and generate a first sample data set;
a data total generation submodule, configured to count the total number of first samples in the first sample data set;
a first clustering number calculation submodule, configured to calculate a first clustering number according to the total number;
a second sample data set generation submodule, configured to cluster the first samples in the first sample data set based on the first clustering number, so as to obtain a second sample data set; the second sample data set includes a plurality of first clusters whose number corresponds to the first clustering number; each first cluster corresponds to one second sample;
a second clustering number calculation submodule, configured to calculate a second clustering number according to the first clustering number;
a second cluster generation submodule, configured to extract the two second samples with the minimum second Euclidean distance from the second sample data set, generate a second cluster, and add the second cluster to a preset third sample data set;
a first judgment submodule, configured to judge whether the number of second clusters in the third sample data set is equal to the second clustering number;
an arithmetic mean calculation submodule, configured to calculate, if so, the arithmetic mean of the first samples in each second cluster;
a second judgment submodule, configured to judge whether the differences between the arithmetic means of any two second clusters are all larger than a preset threshold;
and a network attack team determining submodule, configured to determine, if so, each second cluster as a network attack team.
Optionally, the clustering module further includes:
a first returning submodule, configured to, if the number of second clusters in the third sample data set is not equal to the second clustering number, return to the step of extracting the two second samples with the minimum second Euclidean distance from the second sample data set, generating a second cluster, and adding the second cluster to a preset third sample data set.
The invention also provides an electronic device comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute, according to instructions in the program code, the network attack team identification method described in any one of the above.
The present invention also provides a computer-readable storage medium for storing program code for executing the network attack team identification method as described in any one of the above.
As can be seen from the above technical solution, the invention has the following advantages: by clustering network attack log data, a plurality of network attack teams with the same attack behaviors are obtained; network anomaly information received in real time is then matched against these network attack teams to quickly judge whether it is a network attack launched by one of them, so that network attacks are identified effectively and accurately.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating the steps of a network attack team identification method according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating the steps of a network attack team identification method according to another embodiment of the present invention;
Fig. 3 is a block diagram of a network attack team identification apparatus according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a network attack team identification method, a network attack team identification device, electronic equipment and a storage medium, which are used for solving the technical problem that network attacks cannot be effectively and accurately identified from massive alarm information.
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating steps of a network attack team identification method according to an embodiment of the present invention.
The network attack team identification method provided by the invention specifically comprises the following steps:
Step 101, extracting network attack log data from a preset database;
In the embodiment of the invention, a connection can be made to a database storing network behavior log data, and the network attack log data to be clustered can be selected from that database.
The network attack log data mainly comprises field information such as source IP, destination IP, generation time, and log summary.
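As a minimal sketch, one log record with the fields listed above could be modeled as follows. The class name `AttackLogRecord`, the field names, and the sample values other than the source IP are illustrative assumptions; the source does not specify a schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AttackLogRecord:
    """One network attack log entry (hypothetical schema)."""
    source_ip: str          # source IP of the attack traffic
    destination_ip: str     # destination IP that was targeted
    generated_at: datetime  # generation time of the log entry
    summary: str            # log summary text


# Illustrative values only; they are not taken from any real log.
record = AttackLogRecord(
    source_ip="120.23.44.55",
    destination_ip="10.0.0.8",
    generated_at=datetime(2021, 12, 1, 8, 30),
    summary="SQL injection attempt",
)
```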
Step 102, standardizing the network attack log data to obtain a standardized log data set;
In the embodiment of the present invention, the standardization processing determines whether each IP address in the network attack log data is in the standard IP address format and, if not, converts it to the standard IP address format, such as 120.23.44.55, so as to obtain a standardized log data set.
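The source does not say which non-standard formats occur, so the following is only a sketch under stated assumptions: it treats surrounding whitespace and zero-padded octets as the deviations to repair, and returns `None` for anything that cannot be read as an IPv4 address. The function name `normalize_ipv4` is hypothetical.

```python
import ipaddress
from typing import Optional


def normalize_ipv4(raw: str) -> Optional[str]:
    """Return the dotted-decimal form of an IPv4 string, or None if invalid.

    Assumption: "non-standard" means extra whitespace or zero-padded octets.
    """
    parts = raw.strip().split(".")
    if len(parts) != 4:
        return None
    # Accept only plain ASCII digits in every octet.
    if not all(p.isascii() and p.isdigit() for p in parts):
        return None
    octets = [int(p) for p in parts]
    if any(o > 255 for o in octets):
        return None
    # Building the address from packed bytes yields the canonical text form.
    return str(ipaddress.IPv4Address(bytes(octets)))
```

For example, `normalize_ipv4(" 120.023.044.055 ")` yields `"120.23.44.55"`, while an out-of-range input such as `"300.1.1.1"` yields `None`.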
Step 103, clustering data objects in the standardized log data set to obtain a plurality of network attack teams;
After the standardized log data set is obtained, the data objects in it can be clustered to obtain a plurality of abnormal data sets, where each abnormal data set corresponds to one network attack team.
It can be understood that clustering groups together attacks that share the same behavior patterns, preferred attack methods, and features. Generally speaking, attacks with the same behavior patterns, preferred attack methods, and features are likely to originate from the same network attack team. Clustering therefore partitions the standardized log data set by network attack team.
In one example, a K-means algorithm may be employed to cluster the standardized log data set.
The K-means clustering algorithm is an iterative clustering algorithm. The data are to be divided into K groups: K objects are randomly selected as initial cluster centers, the distance between each object and each cluster center is calculated, and each object is assigned to its nearest cluster center. A cluster center together with the objects assigned to it represents a cluster. The cluster centers are then recomputed from their assigned objects, and the assignment is repeated until it no longer changes.
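The K-means procedure described above can be sketched in plain Python as follows. This is a generic textbook version, not the patent's implementation; it assumes each data object has already been encoded as a numeric vector, which the source does not describe, and the function name `kmeans` and its parameters are illustrative.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def kmeans(points: List[Point], k: int, iterations: int = 100,
           seed: int = 0) -> List[List[Point]]:
    """Partition points into k clusters with the basic K-means loop."""
    rng = random.Random(seed)
    # Randomly select k objects as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    clusters: List[List[Point]] = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assign every object to its nearest cluster center.
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its assigned objects;
        # an empty cluster keeps its previous center.
        new_centers = [
            [sum(col) / len(c) for col in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return clusters
```

With two well-separated groups of points and `k=2`, the function returns one cluster per group.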
Step 104, generating a team profile of each network attack team;
After the data objects in the standardized log data set are clustered to obtain a plurality of network attack teams, a team profile of each network attack team can be generated. The team profile characterizes the behavior patterns, preferences, attack methods, and features of each network attack team.
Step 105, when network anomaly information is received, matching the network anomaly information against the plurality of team profiles, and determining a target network attack team corresponding to the network anomaly information.
Once the team profile of each network attack team has been formed, the network attacks of each team can be accurately located within massive network anomaly information, so that real network attacks can be identified effectively and accurately.
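The source does not define what a team profile contains or how matching is scored, so the following is purely an illustrative sketch. It assumes a profile holds a set of known source IPs and a set of lowercase keywords drawn from log summaries, and that an anomaly is matched to the profile with the highest overlap score. The names `TeamProfile` and `match_team` and the scoring weights are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class TeamProfile:
    """Hypothetical profile of one network attack team."""
    team_id: str
    source_ips: FrozenSet[str]        # IPs the team has attacked from
    summary_keywords: FrozenSet[str]  # lowercase words typical of its logs


def match_team(source_ip: str, summary: str,
               profiles: Iterable[TeamProfile]) -> Optional[TeamProfile]:
    """Return the best-matching profile, or None if nothing overlaps."""
    words = set(summary.lower().split())
    best, best_score = None, 0
    for profile in profiles:
        # Assumed scoring: a known source IP counts double a keyword hit.
        score = (2 if source_ip in profile.source_ips else 0) \
            + len(words & profile.summary_keywords)
        if score > best_score:
            best, best_score = profile, score
    return best
```

An anomaly that overlaps with no profile returns `None`, which in this sketch corresponds to an alarm that cannot be attributed to any known team.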
In the embodiment of the invention, a plurality of network attack teams with the same attack behaviors are obtained by clustering network attack log data; network anomaly information received in real time is then matched against these network attack teams to quickly judge whether it is a network attack launched by one of them, so that network attacks are identified effectively and accurately.
Referring to fig. 2, fig. 2 is a flowchart illustrating steps of a network attack team identification method according to another embodiment of the present invention. The method specifically comprises the following steps:
Steps 201-202 are similar to steps 101-102; for details, refer to the description of steps 101-102, which is not repeated here.
In the embodiment of the invention, isolated points can be identified in the standardized log data set before clustering, so as to eliminate their interference with clustering and obtain a first sample data set without isolated points.
In a specific implementation, the first Euclidean distance of each data object and the average distance of all data objects in the standardized log data set can be calculated, and isolated points can then be filtered out according to these two values.
In one example, the step of extracting first samples from the standardized log data set according to the first Euclidean distance and the average distance to generate a first sample data set may comprise:
extracting data objects whose first Euclidean distance is not greater than the average distance from the standardized log data set as first samples, and generating the first sample data set.
In the embodiment of the present invention, when the first Euclidean distance of a data object is greater than the average distance, the data object can be regarded as an isolated point in the standardized log data set and should be deleted. The first sample data set is generated with the data objects whose first Euclidean distance is not greater than the average distance as first samples.
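The filtering rule can be sketched as below. The source does not state what each object's first Euclidean distance is measured against; this sketch assumes it is the object's mean distance to all other objects, and that the average distance is the mean of those per-object values. The function name `remove_isolated_points` is hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def remove_isolated_points(points: List[Point]) -> List[Point]:
    """Keep the objects whose distance value does not exceed the average."""
    if len(points) < 2:
        return list(points)
    # Assumption: an object's "first Euclidean distance" is its mean
    # Euclidean distance to every other object in the data set.
    per_object = [
        sum(math.dist(p, q) for j, q in enumerate(points) if j != i)
        / (len(points) - 1)
        for i, p in enumerate(points)
    ]
    average = sum(per_object) / len(per_object)
    # Objects above the average are treated as isolated points and dropped.
    return [p for p, d in zip(points, per_object) if d <= average]
```

Under this reading, a single far-away object raises its own distance value well above the average and is removed, while a tight group of objects is kept.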
After the isolated points are eliminated, the total number n of first samples in the first sample data set can be counted, and the first clustering number k can be calculated from n:
k = n^0.5
The first sample data set is then input into the K-means algorithm, which yields k first clusters; each first cluster is taken as a second sample for the next round of clustering.
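As a small worked example of the formula k = n^0.5: the source does not say how a non-integer square root is turned into a cluster count, so rounding to the nearest integer (with a minimum of 1) is an assumption here, as is the function name `first_cluster_count`.

```python
import math


def first_cluster_count(n: int) -> int:
    """First clustering number k = n ** 0.5, rounded (rounding is assumed)."""
    if n <= 0:
        raise ValueError("the first sample data set must not be empty")
    return max(1, round(math.sqrt(n)))
```

For instance, 100 first samples give k = 10, and 10 first samples give k = 3 under this rounding choice.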
After the first sample data set is clustered to obtain k first clusters, the second clustering number K for re-clustering can be calculated based on the first clustering number:
The two second samples with the minimum second Euclidean distance in the second sample data set are added to the third sample data set as a second cluster, and the two second samples are deleted from the second sample data set.
It is then judged whether the number of second clusters in the third sample data set is equal to the second clustering number. If so, this round of merging is complete, and the arithmetic mean of the first samples in each second cluster is calculated in order to judge, according to the arithmetic means, whether to end clustering.
It should be noted that, in the embodiment of the present invention, the method further includes: if the number of second clusters in the third sample data set is not equal to the second clustering number, returning to the step of extracting the two second samples with the minimum second Euclidean distance from the second sample data set, generating a second cluster, and adding the second cluster to a preset third sample data set.
Specifically, when the number of second clusters in the third sample data set is not equal to the second clustering number, clustering is not yet complete, and step 209 may be repeated until the number of second clusters in the third sample data set is equal to the second clustering number.
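The repeated closest-pair extraction can be sketched as follows. The sketch assumes each second sample is represented by a single numeric point (for example, the centroid of a first cluster), which the source does not state, and it takes the second clustering number as an input because the source text does not give the formula for it. The function name `merge_closest_pairs` is hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Sequence[float]


def merge_closest_pairs(second_samples: List[Point],
                        target_count: int) -> List[Tuple[Point, Point]]:
    """Pair up the closest second samples until target_count pairs exist."""
    remaining = list(second_samples)  # the second sample data set
    third_set: List[Tuple[Point, Point]] = []  # the third sample data set
    while len(third_set) < target_count:
        if len(remaining) < 2:
            raise ValueError("not enough second samples for the target count")
        # Find the two second samples with the minimum Euclidean distance.
        i, j = min(
            combinations(range(len(remaining)), 2),
            key=lambda ij: math.dist(remaining[ij[0]], remaining[ij[1]]),
        )
        third_set.append((remaining[i], remaining[j]))
        # Delete both from the second sample data set (j > i, so j first).
        del remaining[j]
        del remaining[i]
    return third_set
```

The loop condition plays the role of the check that compares the number of second clusters with the second clustering number.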
In the embodiment of the present invention, whether to end clustering may be determined by judging whether the differences between the arithmetic means of any two second clusters are all greater than a preset threshold. If so, no two second clusters are similar; the data in each second cluster then form an abnormal data set, and the data in one abnormal data set are regarded as originating from the same network attack team.
It is noted that, in the embodiment of the present invention, the method further includes:
if the difference between the arithmetic means of two second clusters is not larger than the preset threshold, setting the second clustering number as the first clustering number, setting the third sample data set as the second sample data set, and returning to the step of calculating the second clustering number according to the first clustering number.
Specifically, when the difference between the arithmetic means of two second clusters is not greater than the preset threshold, the two second clusters are similar and re-clustering can be performed. The second clustering number is therefore set as the first clustering number, the third sample data set is set as the second sample data set, and the process returns to step 208, until the differences between the arithmetic means of any two second clusters are all larger than the preset threshold. It should be noted that the preset threshold may be set flexibly according to the actual application, and is not specifically limited in the embodiment of the present invention.
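The termination test can be sketched as below. The source does not define how the "difference" between two arithmetic means is measured for multi-dimensional data; this sketch assumes the Euclidean distance between the mean vectors. The function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def cluster_mean(cluster: List[Point]) -> List[float]:
    """Arithmetic mean of the samples in one cluster, per coordinate."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]


def clusters_are_distinct(clusters: List[List[Point]],
                          threshold: float) -> bool:
    """True if every pair of cluster means differs by more than threshold."""
    means = [cluster_mean(c) for c in clusters]
    # Assumption: the difference of two means is their Euclidean distance.
    return all(math.dist(a, b) > threshold
               for a, b in combinations(means, 2))
```

When the function returns `False`, at least one pair of second clusters is considered similar, which corresponds to the case where the method re-clusters.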
After the data objects in the standardized log data set are clustered to obtain a plurality of network attack teams, a team profile of each network attack team can be generated. The team profile characterizes the behavior patterns, preferences, attack methods, and features of each network attack team.
Once the team profile of each network attack team has been formed, the network attacks of each team can be accurately located within massive network anomaly information, so that real network attacks can be identified effectively and accurately.
In the embodiment of the invention, a plurality of network attack teams with the same attack behaviors are obtained by clustering network attack log data; network anomaly information received in real time is then matched against these network attack teams to quickly judge whether it is a network attack launched by one of them, so that network attacks are identified effectively and accurately.
Referring to fig. 3, fig. 3 is a block diagram illustrating a network attack team identification apparatus according to an embodiment of the present invention.
The embodiment of the invention provides a network attack team identification device, which comprises:
an extraction module 301, configured to extract network attack log data from a preset database;
a standardization processing module 302, configured to standardize the network attack log data to obtain a standardized log data set;
a clustering module 303, configured to cluster data objects in the standardized log data set to obtain a plurality of network attack teams;
a team profile generation module 304, configured to generate a team profile of each network attack team;
and a target network attack team determining module 305, configured to match network anomaly information against the plurality of team profiles when the network anomaly information is received, and determine a target network attack team corresponding to the network anomaly information.
In this embodiment of the present invention, the clustering module 303 includes:
a first Euclidean distance and average distance calculation submodule, configured to calculate a first Euclidean distance of each data object and an average distance of all data objects in the standardized log data set;
a first sample data set generation submodule, configured to extract first samples from the standardized log data set according to the first Euclidean distance and the average distance, and generate a first sample data set;
a data total generation submodule, configured to count the total number of first samples in the first sample data set;
a first clustering number calculation submodule, configured to calculate a first clustering number according to the total number;
a second sample data set generation submodule, configured to cluster the first samples in the first sample data set based on the first clustering number to obtain a second sample data set; the second sample data set contains a plurality of first clusters whose number corresponds to the first clustering number; each first cluster corresponds to one second sample;
a second clustering number calculation submodule, configured to calculate a second clustering number according to the first clustering number;
a second cluster generation submodule, configured to extract the two second samples with the minimum second Euclidean distance from the second sample data set, generate a second cluster, and add the second cluster to a preset third sample data set;
a first judgment submodule, configured to judge whether the number of second clusters in the third sample data set is equal to the second clustering number;
an arithmetic mean calculation submodule, configured to calculate, if so, the arithmetic mean of the first samples in each second cluster;
a second judgment submodule, configured to judge whether the differences between the arithmetic means of any two second clusters are all larger than a preset threshold;
and a network attack team determining submodule, configured to determine, if so, each second cluster as a network attack team.
In this embodiment of the present invention, the clustering module 303 further includes:
and the first returning submodule is used for returning to the steps of extracting two second samples with the minimum second Euler distance from the second sample data set, generating a second cluster and adding the second cluster into a preset third sample data set if the number of the second cluster in the third sample data set is not equal to the second cluster number.
In this embodiment of the present invention, the clustering module 303 further includes:
and the second returning submodule is used for setting the second clustering number as the first clustering number, setting the third sample data set as the second sample data set and returning to the step of calculating the second clustering number according to the first clustering number if the difference value of the arithmetic mean of the two second clustering clusters is not larger than the preset threshold value.
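The merge-and-check loop handled by the judgment and returning submodules above can be sketched as follows. This is only an illustrative reading, not the patented implementation. It assumes that "Euler distance" means Euclidean distance, that the distance between two clusters is measured between their arithmetic means, and that the "difference value" of two means is also their Euclidean distance. The rule for deriving the second clustering number is not given in this passage, so `next_cluster_number` (halving by default) is a hypothetical placeholder, as are all function names.

```python
import math
from itertools import combinations


def mean(points):
    """Arithmetic mean of a list of equal-length numeric vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def merge_closest(clusters):
    """Merge the two clusters whose means are closest (assumed Euclidean)."""
    i, j = min(
        combinations(range(len(clusters)), 2),
        key=lambda ij: math.dist(mean(clusters[ij[0]]), mean(clusters[ij[1]])),
    )
    merged = clusters[i] + clusters[j]
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]


def well_separated(clusters, threshold):
    """True if every pair of cluster means differs by more than threshold."""
    return all(
        math.dist(mean(a), mean(b)) > threshold
        for a, b in combinations(clusters, 2)
    )


def identify_teams(first_clusters, threshold,
                   next_cluster_number=lambda k: max(1, k // 2)):
    """Merge clusters until all remaining cluster means are well separated.

    next_cluster_number is a hypothetical stand-in for the formula that
    derives the second clustering number from the first clustering number.
    """
    clusters = [list(c) for c in first_clusters]
    while True:
        # Guard (an addition of this sketch): always shrink by at least one
        # cluster per round so the loop terminates for any placeholder rule.
        target = max(1, min(next_cluster_number(len(clusters)),
                            len(clusters) - 1))
        # First returning step: keep merging until the target count is met.
        while len(clusters) > target:
            clusters = merge_closest(clusters)
        # Second returning step: repeat with a smaller target if any two
        # cluster means are still within the threshold of each other.
        if well_separated(clusters, threshold):
            return clusters
```

With four single-point clusters at (0, 0), (0, 1), (10, 10) and (10, 11) and a threshold of 5, the sketch stops at two clusters; with a threshold larger than the gap between the two groups, it keeps merging until a single cluster remains.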
In an embodiment of the present invention, the first sample data set generation submodule includes:
and a first sample data set generating unit for extracting a data object of which the first euler distance is not more than the average distance from the normalized log data set as a first sample, and generating a first sample data set.
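The filtering rule of the first sample data set generating unit can be illustrated with a short sketch. The reference point for the first Euler distance is not stated in this passage, so the sketch assumes it is the centroid of the standardized log data set, and it reads "Euler distance" as Euclidean distance. The function name `extract_first_samples` is hypothetical.

```python
import math


def extract_first_samples(data):
    """Keep data objects whose distance to the centroid does not exceed
    the average distance of all data objects (assumed interpretation)."""
    dim = len(data[0])
    # Assumption: distances are measured from the centroid of the data set.
    centroid = [sum(row[i] for row in data) / len(data) for i in range(dim)]
    distances = [math.dist(row, centroid) for row in data]
    average = sum(distances) / len(distances)
    return [row for row, d in zip(data, distances) if d <= average]
```

Under these assumptions, a data object lying far from the bulk of the data is dropped before clustering, while objects at or below the average distance become first samples.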
An embodiment of the present invention further provides an electronic device, where the device includes a processor and a memory:
the memory is used for storing the program codes and transmitting the program codes to the processor;
the processor is used for executing the network attack team identification method according to the embodiment of the invention according to the instructions in the program codes.
The embodiment of the invention also provides a computer-readable storage medium which is used for storing the program codes, and the program codes are used for executing the network attack team identification method of the embodiment of the invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A network attack team identification method is characterized by comprising the following steps:
extracting network attack log data from a preset database;
standardizing the network attack log data to obtain a standardized log data set;
clustering data objects in the standardized log data set to obtain a plurality of network attack teams;
generating a team portrait of each of the network attack teams;
when network abnormal information is received, matching the network abnormal information against the team portraits, and determining a target network attack team corresponding to the network abnormal information.
2. The cyber attack team identification method according to claim 1, wherein the step of clustering the data objects in the standardized log data set to obtain a plurality of cyber attack teams comprises:
calculating a first Euler distance for each of said data objects in said normalized log data set and an average distance for all of said data objects;
extracting a first sample from the normalized log dataset according to the first euler distance and the average distance, generating a first sample dataset;
counting a total number of data of a first sample in the first sample dataset;
calculating a first clustering number according to the total data number;
clustering first samples in the first sample data set based on the first clustering number to obtain a second sample data set; the second sample data set includes a plurality of first cluster clusters corresponding to the first cluster number; each first clustering cluster corresponds to one second sample;
calculating a second clustering number according to the first clustering number;
extracting two second samples with the minimum second Euler distance from the second sample data set, generating a second cluster, and adding the second cluster into a preset third sample data set;
judging whether the number of the second cluster in the third sample data set is equal to the second cluster number;
if yes, respectively calculating the arithmetic mean of the first samples in each second cluster;
judging whether the difference values of the arithmetic mean of any two second cluster clusters are both larger than a preset threshold value;
and if so, determining each second clustering cluster as a network attack team.
3. The network attack team identification method of claim 2, further comprising:
and if the number of the second cluster in the third sample data set is not equal to the second cluster number, returning to the step of extracting two second samples with the minimum second Euler distance from the second sample data set, generating a second cluster, and adding the second cluster into a preset third sample data set.
4. The network attack team identification method of claim 2, further comprising:
and if the difference value of the arithmetic mean of any two second clustering clusters is not larger than the preset threshold value, setting the second clustering number as the first clustering number, setting the third sample data set as the second sample data set, and returning to the step of calculating the second clustering number according to the first clustering number.
5. The cyber attack team identifying method as claimed in claim 2, wherein the step of extracting a first sample from the standardized log data set according to the first Euler distance and the average distance to generate a first sample data set comprises:
and extracting data objects with the first Euler distance not greater than the average distance from the standardized log data set as first samples, and generating a first sample data set.
6. A cyber attack team identifying apparatus, comprising:
the extraction module is used for extracting the network attack log data from a preset database;
the standardized processing module is used for carrying out standardized processing on the network attack log data to obtain a standardized log data set;
the clustering module is used for clustering data objects in the standardized log data set to obtain a plurality of network attack teams;
the team portrait generation module is used for generating a team portrait of each network attack team;
and the target network attack team determining module is used for matching the network abnormal information against the team portraits when the network abnormal information is received, and determining a target network attack team corresponding to the network abnormal information.
7. The cyber attack team identifying device according to claim 6, wherein the clustering module comprises:
a first euler distance and average distance calculation sub-module for calculating a first euler distance for each of said data objects in said standardized log data set and an average distance for all of said data objects;
a first sample data set generation submodule, configured to extract a first sample from the normalized log data set according to the first euler distance and the average distance, and generate a first sample data set;
a data total generation submodule, configured to count a data total of a first sample in the first sample data set;
the first clustering number calculating submodule is used for calculating a first clustering number according to the total data number;
a second sample data set generation submodule, configured to cluster first samples in the first sample data set based on the first cluster number, so as to obtain a second sample data set; the second sample data set includes a plurality of first cluster clusters corresponding to the first cluster number; each first clustering cluster corresponds to one second sample;
the second clustering number calculating submodule is used for calculating a second clustering number according to the first clustering number;
a second cluster generation sub-module, configured to extract two second samples with a minimum second euler distance from the second sample data set, generate a second cluster, and add the second cluster into a preset third sample data set;
a first determining sub-module, configured to determine whether the number of the second cluster in the third sample data set is equal to the second cluster number;
an arithmetic mean calculation sub-module, configured to calculate an arithmetic mean of the first samples in each of the second cluster if yes;
the second judgment submodule is used for judging whether the difference values of the arithmetic mean of any two second clustering clusters are both larger than a preset threshold value;
and the network attack team determining submodule is used for determining each second clustering cluster as a network attack team if the difference values of the arithmetic mean of any two second clustering clusters are all larger than the preset threshold value.
8. The cyber attack team identifying device according to claim 7, further comprising:
and a first returning sub-module, configured to, if the number of the second cluster in the third sample data set is not equal to the second cluster number, return to the step of extracting two second samples with a minimum second euler distance from the second sample data set, generating a second cluster, and adding the second cluster to a preset third sample data set.
9. An electronic device, comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the network attack team identification method of any of claims 1-5 according to instructions in the program code.
10. A computer-readable storage medium characterized in that the computer-readable storage medium is configured to store a program code for executing the network attack team identification method according to any one of claims 1 to 5.
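As a rough illustration of the standardizing, portrait generating and matching steps recited in claim 1, the sketch below uses stand-in techniques that the claims do not specify. It assumes z-score standardization, a team portrait equal to the per-feature mean of a team's samples, and matching by nearest portrait under Euclidean distance. All function names and the example team labels are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column (assumed method); constant columns map to 0.0."""
    stats = [(mean(col), pstdev(col)) for col in zip(*rows)]
    return [
        [(v - m) / s if s else 0.0 for v, (m, s) in zip(row, stats)]
        for row in rows
    ]


def build_portraits(teams):
    """Hypothetical portrait: per-feature mean of each team's samples."""
    return {
        name: [mean(col) for col in zip(*samples)]
        for name, samples in teams.items()
    }


def match_team(anomaly, portraits):
    """Return the team whose portrait is nearest to the anomaly vector."""
    return min(portraits, key=lambda name: math.dist(anomaly, portraits[name]))


# Example with two made-up teams in a two-feature space.
teams = {
    "team_a": [[0.0, 0.0], [0.0, 1.0]],
    "team_b": [[10.0, 10.0], [10.0, 11.0]],
}
portraits = build_portraits(teams)
target = match_team([9.0, 9.0], portraits)
```

In this example the anomaly vector lies closest to the portrait of `team_b`, so that team would be reported as the target network attack team. A real portrait would likely carry richer attributes than a mean vector.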
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111519490.2A CN114186232B (en) | 2021-12-13 | 2021-12-13 | A network attack team identification method, device, electronic device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114186232A (en) | 2022-03-15 |
CN114186232B (en) | 2024-12-20 |
Family
ID=80604738
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111519490.2A Active CN114186232B (en) | 2021-12-13 | 2021-12-13 | A network attack team identification method, device, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114186232B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115550015A (en) * | 2022-09-23 | 2022-12-30 | 北京中睿天下信息技术有限公司 | An Attack Analysis Method Based on Asset Classification |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108881294A (en) * | 2018-07-23 | 2018-11-23 | 杭州安恒信息技术股份有限公司 | Attack source IP portrait generation method and device based on attack |
CN108924163A (en) * | 2018-08-14 | 2018-11-30 | 成都信息工程大学 | Attacker's portrait method and system based on unsupervised learning |
WO2020147317A1 (en) * | 2019-01-18 | 2020-07-23 | 郑州云海信息技术有限公司 | Method, apparatus, and device for determining network anomaly behavior, and readable storage medium |
CN111800430A (en) * | 2020-07-10 | 2020-10-20 | 南方电网科学研究院有限责任公司 | Attack group identification method, device, equipment and medium |
CN112165462A (en) * | 2020-09-11 | 2021-01-01 | 哈尔滨安天科技集团股份有限公司 | Attack prediction method and device based on portrait, electronic equipment and storage medium |
CN112351031A (en) * | 2020-11-05 | 2021-02-09 | 中国电子信息产业集团有限公司 | Generation method and device of attack behavior portrait, electronic equipment and storage medium |
CN112966264A (en) * | 2021-02-28 | 2021-06-15 | 新华三信息安全技术有限公司 | XSS attack detection method, device, equipment and machine-readable storage medium |
Non-Patent Citations (2)
Title |
---|
YIXIN WU et al.: "GroupTracer: Automatic attacker TTP profile extraction and group cluster in Internet of things", SECURITY AND COMMUNICATION NETWORKS, 4 December 2020 (2020-12-04), pages 1 - 14, XP093019322, DOI: 10.1155/2020/8842539 * |
WANG NAN et al.: "Application of security situation awareness in network attack defense", Telecommunications Technology (电信技术), no. 03, 25 March 2017 (2017-03-25), pages 86 - 88 * |
Also Published As
Publication number | Publication date |
---|---|
CN114186232B (en) | 2024-12-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||