
CN111314170B - Feature fuzzy P2P protocol identification method based on connection statistical rule analysis - Google Patents

Feature fuzzy P2P protocol identification method based on connection statistical rule analysis Download PDF

Info

Publication number
CN111314170B
Authority
CN
China
Prior art keywords
protocol
flow
detected
characteristic
elements
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010049565.4A
Other languages
Chinese (zh)
Other versions
CN111314170A (en)
Inventor
石小川
刘琦
黄龙飞
张晶
刘家祥
赵昆杨
陈瑜靓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Useear Information Technology Co ltd
Original Assignee
Fujian Qidian Space Time Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Qidian Space Time Digital Technology Co ltd
Priority to CN202010049565.4A
Publication of CN111314170A
Application granted
Publication of CN111314170B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/18 Protocol analysers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A feature fuzzy P2P protocol identification method based on connection statistical rule analysis comprises: S1, statistically recording data related to P2P protocols; S2, extracting the feature elements related to the P2P protocols from the recorded data, and establishing a P2P protocol feature database; S3, calculating, according to connection statistical rules, the frequency with which the feature elements of each P2P protocol occur in the P2P protocol feature database, deriving the occurrence probability of each feature element, and sorting and grading the elements by probability; S4, monitoring the traffic of the application host, and extracting the feature elements of the traffic to be detected; S5, comparing the feature elements in the P2P protocol feature database with those of the traffic to be detected, in descending order of occurrence probability; and S6, determining which type of protocol the traffic to be detected uses. By statistically analyzing which P2P protocol features occur with high probability and matching those first, the invention improves the efficiency of protocol type identification.

Description

Feature fuzzy P2P protocol identification method based on connection statistical rule analysis
Technical Field
The invention relates to the technical field of computer network security, in particular to a characteristic fuzzy P2P protocol identification method based on connection statistical rule analysis.
Background
A statistical rule is a regularity that holds over a large number of individually random events taken as a whole; it expresses the nature of, and the necessary connections within, that whole. Probability is the basic concept of statistical theory: it reflects the essential characteristics of a random process and measures how likely a random event is to occur, which can be estimated from the frequency with which the event occurs over many repetitions of the process. Protocol identification means identifying the type of protocol used by traffic transmitted over a network link. Conventional protocol identification uses two main approaches. The first scans message content; it is highly accurate but does not work for applications that use encrypted links or that have no usable key features. The second builds a data model from statistics of message information; it is efficient but has a poor recognition rate. The prior art is therefore inefficient at determining which protocol a given traffic flow uses and leaves considerable room for improvement.
Disclosure of Invention
(I) Objects of the invention
In order to solve the technical problems described in the background, the invention provides a feature fuzzy P2P protocol identification method based on connection statistical rule analysis. By statistically analyzing P2P protocol feature data, the method matches the traffic to be detected against high-probability features first, which improves the efficiency of protocol type identification while keeping identification accuracy high.
(II) Technical scheme
In order to solve the above problems, the invention provides a feature fuzzy P2P protocol identification method based on connection statistical rule analysis, comprising:
S1, collecting statistics on the content and behavior of the P2P protocol;
S2, extracting content features and behavior features of the P2P protocol, and establishing a P2P protocol feature database;
S3, statistically calculating the probability of occurrence of each feature of the P2P protocol, and sorting and grading the features by probability;
S4, monitoring the traffic of the host, and extracting the content and behavior features of the traffic to be detected;
S5, matching the content and behavior features of the traffic to be detected against the P2P protocol feature database, in descending order of the occurrence probability of the P2P protocol features in the database;
and S6, determining the type of protocol used by the traffic to be detected.
Preferably, if the traffic to be detected uses the P2P protocol, its P2P protocol feature data is recorded.
Preferably, the P2P protocol feature data in the traffic to be detected is recorded multiple times, and the probability of occurrence of each feature is calculated; according to the resulting probability distribution, the feature data of the traffic to be detected is matched first against the P2P protocol features with the highest occurrence probability.
Preferably, the traffic of the host is monitored and the traffic to be detected is matched against the protocol feature database; if it matches neither the protocol content features nor the behavior features in the feature database, the traffic to be detected is marked as an unknown application.
Preferably, after the traffic to be detected has been identified as using a P2P protocol, the traffic to be detected is controlled.
Preferably, a message length threshold is preset, and it is judged whether the number of messages of the traffic to be detected in the traffic monitoring state is greater than the threshold; if not, traffic monitoring continues; if so, traffic monitoring is abandoned and the application is finally identified as one using a P2P-like protocol.
Preferably, the content features and behavior features of the P2P protocol include the IP address, the port number and the protocol number of the traffic to be detected, and the extraction result is output as a log.
Preferably, the data of the local P2P protocol feature database is periodically uploaded to the cloud.
Preferably, P2P protocol feature matching detection is repeated on traffic whose detection has been completed.
Preferably, if the traffic to be detected is found not to use the P2P protocol, its content and behavior features are recorded and used to establish a database.
In the invention, statistics are collected on the content and behavior of the P2P protocol, and the related data is recorded; the feature elements related to the P2P protocol are extracted from the recorded data, and a P2P protocol feature database is established; the frequency with which each feature element occurs in the database is calculated according to connection statistical rules, the occurrence probability of each feature element is derived, and the elements are sorted and graded by probability; the traffic of the application host is monitored, and the feature elements of the traffic to be detected are extracted; the feature elements in the database are compared with those of the traffic to be detected in descending order of occurrence probability; and the type of protocol used by the traffic to be detected is determined. By collecting statistics on P2P protocol feature data and analyzing which features occur with high probability, the invention matches the traffic to be detected against high-probability features first, which improves the efficiency of protocol type identification while keeping identification accuracy high.
In the invention, after detection is completed, if the traffic to be detected uses the P2P protocol, its feature elements are recorded; matching, detection and identification follow the probability distribution, which improves the efficiency of protocol type identification while keeping accuracy high; monitoring the host traffic facilitates management of the traffic to be detected; controlling the traffic to be detected makes it easier to identify and detect; outputting the extraction result as a log is simple and direct and makes the result easy to obtain; periodically uploading the data of the local P2P protocol feature database to the cloud avoids data loss; repeating P2P protocol feature matching detection on already detected traffic improves identification accuracy; and establishing a database from the feature elements facilitates the subsequent identification and detection of other traffic.
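The probability calculation, sorting and grading of step S3 can be illustrated with a minimal sketch. It assumes that each recorded P2P sample has already been reduced to a set of feature strings, and it estimates a feature's probability as the fraction of samples in which the feature appears. The feature names, the `rank_features` helper and the grade boundaries (0.6 and 0.3) are hypothetical and are not taken from the patent.

```python
from collections import Counter


def rank_features(samples, high=0.6, medium=0.3):
    """Estimate each feature's occurrence probability and rank the features.

    samples: list of sets of feature strings, one set per recorded P2P flow.
    Returns (feature, probability, grade) tuples, most probable first.
    The grade boundaries `high` and `medium` are illustrative assumptions.
    """
    if not samples:
        return []
    counts = Counter()
    for features in samples:
        counts.update(set(features))  # count each feature once per sample
    total = len(samples)
    ranked = []
    # Sort by descending count; ties are broken by feature name.
    for feature, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        p = n / total
        grade = "high" if p >= high else "medium" if p >= medium else "low"
        ranked.append((feature, p, grade))
    return ranked


# Hypothetical recorded samples: one feature set per observed P2P flow.
samples = [
    {"port_in_known_range", "fixed_payload_prefix"},
    {"port_in_known_range", "many_peer_connections"},
    {"port_in_known_range", "fixed_payload_prefix", "many_peer_connections"},
    {"handshake_marker"},
]

for feature, p, grade in rank_features(samples):
    print(f"{feature}: p={p:.2f} grade={grade}")
```

The ranked list is what a later matching step would walk through from top to bottom, so features that rarely occur are only compared when the common ones have failed to match.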
Drawings
Fig. 1 is a schematic flow chart of a feature fuzzy P2P protocol identification method based on connection statistical rule analysis according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
As shown in Fig. 1, the feature fuzzy P2P protocol identification method based on connection statistical rule analysis provided by the invention includes:
S1, statistically recording data related to the P2P protocol;
S2, extracting the feature elements related to the P2P protocol from the recorded data, and establishing a P2P protocol feature database;
S3, calculating, according to connection statistical rules, the frequency with which the feature elements of each P2P protocol occur in the P2P protocol feature database, deriving the occurrence probability of each feature element, and sorting and grading the elements by probability;
S4, monitoring the traffic of the application host, and extracting the feature elements of the traffic to be detected;
S5, comparing the feature elements in the P2P protocol feature database with those of the traffic to be detected, in descending order of occurrence probability;
and S6, determining which type of protocol the traffic to be detected uses.
In the invention, statistics are collected on the content and behavior of the P2P protocol, and the related data is recorded; the feature elements related to the P2P protocol are extracted from the recorded data, and a P2P protocol feature database is established; the frequency with which each feature element occurs in the database is calculated according to connection statistical rules, the occurrence probability of each feature element is derived, and the elements are sorted and graded by probability; the traffic of the application host is monitored, and the feature elements of the traffic to be detected are extracted; the feature elements in the database are compared with those of the traffic to be detected in descending order of occurrence probability; and the type of protocol used by the traffic to be detected is determined. By collecting statistics on P2P protocol feature data and analyzing which features occur with high probability, the invention matches the traffic to be detected against high-probability features first, which improves the efficiency of protocol type identification while keeping identification accuracy high.
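The matching order of step S5 can be sketched as follows. The example assumes the database is a list of (feature, probability) pairs and that the traffic to be detected has been reduced to a set of feature strings; the `first_match` helper and the feature names are hypothetical. It also returns the number of comparisons made, to show why trying the most probable features first tends to end the search early.

```python
def first_match(ranked_features, flow_features):
    """Compare database features against a flow, most probable first.

    ranked_features: list of (feature, probability) pairs, in any order.
    flow_features: set of features extracted from the traffic to be detected.
    Returns (matched_feature, comparisons); matched_feature is None when no
    database feature is present in the flow.
    """
    ordered = sorted(ranked_features, key=lambda fp: fp[1], reverse=True)
    for comparisons, (feature, _prob) in enumerate(ordered, start=1):
        if feature in flow_features:
            return feature, comparisons  # stop at the first hit
    return None, len(ordered)


# Hypothetical database entries and flow.
db = [("rare_marker", 0.05), ("common_marker", 0.80), ("occasional_marker", 0.30)]
flow = {"common_marker", "unrelated_feature"}

print(first_match(db, flow))          # ('common_marker', 1)
print(first_match(db, {"something"}))  # (None, 3)
```

Treating a single matching feature as sufficient is a simplification made for this sketch; the description does not state how many features must match.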
In an optional embodiment, after detection is completed, if the protocol used by the traffic to be detected is the P2P protocol, its feature elements are recorded.
It should be noted that recording the feature elements of the traffic to be detected facilitates the subsequent identification and detection of other traffic.
In an optional embodiment, the feature elements of the P2P protocol in the traffic to be detected are recorded multiple times, and the occurrence frequency of each feature element is calculated; according to the frequency distribution, the feature elements with the highest frequency are matched first against the P2P protocol feature database the next time traffic is to be detected.
It should be noted that performing matching, detection and identification according to the probability distribution improves the efficiency of protocol type identification while keeping identification accuracy high.
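The repeated recording described in this embodiment can be sketched as a running tally that is updated each time a flow is confirmed as P2P, so that the matching order for the next detection reflects the latest frequencies. The `FeatureStats` class and the feature names are hypothetical, and representing a flow as a collection of feature strings is an assumption of this sketch.

```python
from collections import Counter


class FeatureStats:
    """Running occurrence counts for features of flows confirmed as P2P."""

    def __init__(self):
        self.counts = Counter()
        self.flows = 0

    def record(self, features):
        """Record the feature elements of one confirmed P2P flow."""
        self.flows += 1
        self.counts.update(set(features))  # each feature counts once per flow

    def frequency(self, feature):
        """Fraction of recorded flows in which the feature occurred."""
        return self.counts[feature] / self.flows if self.flows else 0.0

    def match_order(self):
        """Features ordered by descending frequency (ties broken by name)."""
        pairs = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [feature for feature, _ in pairs]


stats = FeatureStats()
stats.record({"marker_a", "marker_b"})
stats.record({"marker_b"})
stats.record({"marker_b", "marker_c"})

print(stats.match_order())          # ['marker_b', 'marker_a', 'marker_c']
print(stats.frequency("marker_b"))  # 1.0
```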
In an optional embodiment, the application host traffic is monitored and the traffic to be detected is matched against the P2P protocol feature database; if the features of the traffic to be detected match none of the feature elements in the feature database, the traffic is identified as using an unknown protocol.
It should be noted that monitoring the host traffic facilitates management of the traffic to be detected.
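The fallback to an unknown protocol can be sketched as below. The example assumes the feature database maps each protocol label to its features and their occurrence probabilities, tries all entries in descending order of probability, and returns `"unknown"` when nothing matches. The `classify` helper, the labels `protocol_a` and `protocol_b`, and the feature names are hypothetical, and accepting the first matching feature as decisive is a simplification.

```python
def classify(flow_features, database):
    """Return the protocol label of the first matching feature, else "unknown".

    database: dict mapping protocol label -> {feature: probability}.
    Entries from all protocols are tried in descending probability order.
    """
    entries = [
        (prob, protocol, feature)
        for protocol, feats in database.items()
        for feature, prob in feats.items()
    ]
    # Highest probability first; label and feature name break ties.
    entries.sort(key=lambda e: (-e[0], e[1], e[2]))
    for _prob, protocol, feature in entries:
        if feature in flow_features:
            return protocol
    return "unknown"


# Hypothetical feature database with two P2P protocols.
database = {
    "protocol_a": {"marker_x": 0.9, "marker_y": 0.2},
    "protocol_b": {"marker_z": 0.6},
}

print(classify({"marker_z", "marker_y"}, database))  # protocol_b
print(classify({"other_feature"}, database))         # unknown
```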
In an optional embodiment, if identification and detection find that the traffic to be detected uses the P2P protocol, flow control is applied to it.
It should be noted that applying flow control to the traffic to be detected makes it easier to identify and detect.
In an optional embodiment, a maximum value of the message length is set, and it is judged whether the number of messages of the traffic to be detected in the traffic monitoring state is greater than this maximum value; if not, traffic monitoring continues; if so, traffic monitoring is abandoned and the traffic is regarded as using a P2P-like protocol.
It should be noted that this makes the traffic to be detected easier to manage.
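The threshold check in this embodiment can be sketched as a simple counter. The text names the limit a message length maximum but compares it with the number of messages; this sketch assumes the second reading, that is, a limit on how many messages of one flow are observed before monitoring is given up. The `monitor_flow` helper and its return labels are hypothetical.

```python
def monitor_flow(packets, threshold):
    """Count a flow's messages until the preset threshold is exceeded.

    packets: iterable of messages belonging to one monitored flow.
    threshold: preset limit on the number of messages (assumed reading).
    Returns ("p2p_like", n) as soon as n messages exceed the threshold, at
    which point monitoring stops; otherwise ("monitoring", n) once the
    available input has been consumed.
    """
    seen = 0
    for _packet in packets:
        seen += 1
        if seen > threshold:
            return "p2p_like", seen  # abandon monitoring for this flow
    return "monitoring", seen


print(monitor_flow(range(10), threshold=5))  # ('p2p_like', 6)
print(monitor_flow(range(5), threshold=5))   # ('monitoring', 5)
```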
In an optional embodiment, the content features and behavior features extracted for the P2P protocol include the IP address, port number and message length of the traffic to be detected; the extraction result is output as a working log.
It should be noted that outputting the extraction result as a log is simple and direct and makes the result easy to obtain.
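A minimal sketch of this extraction and logging step follows. It assumes each message has already been parsed into a dictionary with `ip`, `port` and `length` keys; that input shape, the `FlowFeatures` type, the `extract_features` helper, the logger name and the use of bytes as the length unit are all assumptions of this example. Logging uses Python's standard `logging` module.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("p2p_features")  # hypothetical logger name


@dataclass(frozen=True)
class FlowFeatures:
    ip: str
    port: int
    length: int  # message length, assumed to be in bytes


def extract_features(record):
    """Build FlowFeatures from a parsed message record and log the result."""
    features = FlowFeatures(
        ip=str(record["ip"]),
        port=int(record["port"]),
        length=int(record["length"]),
    )
    log.info(
        "extracted ip=%s port=%d length=%d",
        features.ip, features.port, features.length,
    )
    return features


# 192.0.2.0/24 is an address block reserved for documentation examples.
example = extract_features({"ip": "192.0.2.10", "port": "40000", "length": 128})
```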
In an optional embodiment, the data of the local P2P protocol feature database is periodically uploaded to the cloud.
It should be noted that periodically uploading the data of the local P2P protocol feature database to the cloud avoids data loss.
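The periodic upload can be sketched without committing to any particular cloud service. The example below assumes the local database can be serialized to JSON and takes the actual transfer as a caller-supplied `upload` callable; the `upload_if_due` helper, its parameters and the interval value are hypothetical.

```python
import json
import time


def upload_if_due(database, last_upload, interval_s, upload, now=None):
    """Upload a JSON snapshot of the database when the interval has elapsed.

    database: JSON-serializable local feature database.
    last_upload: timestamp (seconds) of the previous upload.
    interval_s: upload period in seconds.
    upload: caller-supplied callable that receives the serialized bytes.
    Returns the timestamp of the most recent upload.
    """
    now = time.time() if now is None else now
    if now - last_upload < interval_s:
        return last_upload  # not due yet
    upload(json.dumps(database, sort_keys=True).encode("utf-8"))
    return now


sent = []  # stands in for the cloud endpoint in this sketch
last = upload_if_due({"marker_a": 3}, 0, 60, sent.append, now=30)
last = upload_if_due({"marker_a": 3}, last, 60, sent.append, now=60)
print(last, sent)  # 60 [b'{"marker_a": 3}']
```

Passing the transfer in as a callable keeps the scheduling logic testable and leaves the choice of storage backend open.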
In an optional embodiment, P2P protocol feature matching detection is repeated on traffic whose detection has been completed.
It should be noted that repeating P2P protocol feature matching detection on already detected traffic improves identification accuracy.
In an optional embodiment, if the traffic to be detected is found not to use the P2P protocol, its feature elements are recorded, and a database is established from these feature elements.
It should be noted that establishing the database from the feature elements facilitates the subsequent identification and detection of other traffic.
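One way such a database of non-P2P traffic might help later detection is as a quick lookup that is consulted before the full matching pass. The sketch below assumes a flow is represented by its set of feature elements and that an exact match on that set is enough to recognize previously seen non-P2P traffic; the `NonP2PStore` class and this usage are illustrative assumptions, since the description does not say how the database is queried.

```python
class NonP2PStore:
    """Feature sets of flows that were found not to use a P2P protocol."""

    def __init__(self):
        self._seen = set()

    def record(self, features):
        """Record the feature elements of one non-P2P flow."""
        self._seen.add(frozenset(features))

    def known_non_p2p(self, features):
        """True if exactly this feature set was recorded before."""
        return frozenset(features) in self._seen

    def __len__(self):
        return len(self._seen)


store = NonP2PStore()
store.record({"marker_q", "marker_r"})
store.record(["marker_r", "marker_q"])  # same set, stored once

print(store.known_non_p2p({"marker_r", "marker_q"}))  # True
print(store.known_non_p2p({"marker_q"}))              # False
```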
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (5)

1. A feature fuzzy P2P protocol identification method based on connection statistical rule analysis, characterized by comprising the following steps:
S1, statistically recording data related to the P2P protocol;
S2, extracting the feature elements related to the P2P protocol from the recorded data, and establishing a P2P protocol feature database;
S3, analyzing and calculating, according to connection statistical rules, the frequency with which the feature elements of each P2P protocol occur in the P2P protocol feature database, calculating the probability of the feature elements, and sorting and grading them by probability; setting a maximum value of the message length, and judging whether the number of messages of the traffic to be detected in the traffic monitoring state is greater than the maximum value of the message length; if not, continuing traffic monitoring; if so, abandoning traffic monitoring of the network node and regarding the traffic as using a P2P-like protocol;
S4, monitoring the traffic of the application host, and extracting the feature elements of the traffic to be detected; monitoring the traffic of the application host, matching the traffic to be detected against the P2P protocol feature database, and, if the features of the traffic to be detected match none of the feature elements in the feature database, marking the traffic to be detected as using an unknown protocol;
S5, comparing, in the P2P protocol feature database, the P2P protocol feature elements with the feature elements of the traffic to be detected in descending order of occurrence probability;
S6, determining which type of protocol the traffic to be detected uses; if identification and detection show that the traffic to be detected uses a P2P protocol, controlling the traffic to be detected;
after detection is completed, if the protocol used by the traffic to be detected is a P2P protocol, recording its feature elements; recording the feature elements of the P2P protocol in the traffic to be detected multiple times and calculating the occurrence frequency of the feature elements; and, according to the frequency distribution, matching the feature elements with the highest frequency first against the P2P protocol feature database the next time traffic is to be detected.
2. The feature fuzzy P2P protocol identification method based on connection statistical rule analysis according to claim 1, wherein the content features and behavior features extracted for the P2P protocol include the IP address, port number and message length of the traffic to be detected; and the extraction result is output as a working log.
3. The feature fuzzy P2P protocol identification method based on connection statistical rule analysis according to claim 1, wherein the data of the local P2P protocol feature database is periodically uploaded to the cloud.
4. The feature fuzzy P2P protocol identification method based on connection statistical rule analysis according to claim 1, wherein P2P protocol feature matching detection is repeated on traffic whose detection has been completed.
5. The feature fuzzy P2P protocol identification method based on connection statistical rule analysis according to claim 1, wherein, if the traffic to be detected is found not to use the P2P protocol, its feature elements are recorded, and a database is established from these feature elements.
CN202010049565.4A 2020-01-16 2020-01-16 Feature fuzzy P2P protocol identification method based on connection statistical rule analysis Active CN111314170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010049565.4A CN111314170B (en) 2020-01-16 2020-01-16 Feature fuzzy P2P protocol identification method based on connection statistical rule analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010049565.4A CN111314170B (en) 2020-01-16 2020-01-16 Feature fuzzy P2P protocol identification method based on connection statistical rule analysis

Publications (2)

Publication Number Publication Date
CN111314170A CN111314170A (en) 2020-06-19
CN111314170B CN111314170B (en) 2021-12-03

Family

ID=71146779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010049565.4A Active CN111314170B (en) 2020-01-16 2020-01-16 Feature fuzzy P2P protocol identification method based on connection statistical rule analysis

Country Status (1)

Country Link
CN (1) CN111314170B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101030835A (en) * 2007-02-09 2007-09-05 华为技术有限公司 Apparatus and method for obtaining detection characteristics
CN102045347A (en) * 2010-11-30 2011-05-04 华为技术有限公司 Method and device for identifying protocol
CN102420833A (en) * 2011-12-27 2012-04-18 华为技术有限公司 Method, device and system for identifying network protocol

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7068789B2 (en) * 2001-09-19 2006-06-27 Microsoft Corporation Peer-to-peer name resolution protocol (PNRP) group security infrastructure and method
CN101282251B (en) * 2008-05-08 2011-04-13 中国科学院计算技术研究所 Method for digging recognition characteristic of application layer protocol
CN102315974B (en) * 2011-10-17 2014-08-27 北京邮电大学 Stratification characteristic analysis-based method and apparatus thereof for on-line identification for TCP, UDP flows
US9571511B2 (en) * 2013-06-14 2017-02-14 Damballa, Inc. Systems and methods for traffic classification

Also Published As

Publication number Publication date
CN111314170A (en) 2020-06-19

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220909

Address after: 361000 units 1702 and 1703, No. 59, Chengyi North Street, phase III, software park, Xiamen, Fujian

Patentee after: XIAMEN USEEAR INFORMATION TECHNOLOGY Co.,Ltd.

Address before: Unit 1701, unit 1704, No. 59, Chengyi North Street, phase III, software park, Xiamen City, Fujian Province, 361000

Patentee before: FUJIAN QIDIAN SPACE-TIME DIGITAL TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right