CN117650935A - Interference flow identification method based on service application classification model - Google Patents
Interference flow identification method based on service application classification model Download PDFInfo
- Publication number
- CN117650935A CN117650935A CN202311677029.9A CN202311677029A CN117650935A CN 117650935 A CN117650935 A CN 117650935A CN 202311677029 A CN202311677029 A CN 202311677029A CN 117650935 A CN117650935 A CN 117650935A
- Authority
- CN
- China
- Prior art keywords
- model
- flow
- sample
- training
- service application
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/02—Capturing of monitoring data
- H04L43/026—Capturing of monitoring data using flow identification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/12—Network monitoring probes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
Abstract
The invention discloses an interference traffic identification method based on a service application classification model, which belongs to the technical field of broadband networks and is characterized by comprising the following steps: a. collecting a clean data set 1, splitting it by flow, and computing feature statistics to form a feature vector of length 54, with all samples used as training data to train the initialized model A; b. setting up a test acquisition environment; c. training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain predicted test labels, and evaluating the test results with machine learning evaluation metrics; d. converting the raw traffic data, session flow by session flow, into model input data of 54 features, and inputting it into model C to obtain a major-category application label or an interference traffic label. The invention can distinguish interference traffic within encrypted traffic, accurately identify normal service applications and interference traffic, and improve classification accuracy and model generalization.
Description
Technical Field
The invention relates to the technical field of broadband networks, in particular to an interference flow identification method based on a service application classification model.
Background
As research on internet service application identification becomes more widespread in academia and industry, accurately identifying internet service applications matters more and more to researchers and enterprises. By identifying application labels from massive, easily obtained raw network traffic byte streams, an enterprise can learn at low cost which applications its users access, and can formulate marketing and customer care strategies accordingly.
Because internet security has improved greatly, most network traffic is now encrypted, so an AI model is usually used to identify application labels from the side-channel information that the traffic leaks. The raw traffic produced when a user accesses a specific application generally contains many sessions that do not belong to that application. These sessions are the interference traffic of the application: general application data or general connection procedures, such as TCP handshake data, that do not reflect the characteristics of the application. Interference traffic greatly reduces the accuracy and generalization of an AI model that identifies applications, so it is important to identify and remove interference traffic while identifying service applications.
The Chinese patent document with publication number CN116827875A, published on September 29, 2023, discloses an APP traffic identification and denoising method based on multi-model decision, which comprises the following steps:
Step 1: collecting network traffic, removing data corresponding to non-main IP sub-networks, and generating a label set Y for the network traffic according to the APP name;
Step 2: extracting network flow features, which are statistical features extracted from the network flow; for forward and reverse messages, extracting statistical features of message length and time-related features of message length, and preprocessing them to form a feature vector set X of the traffic;
Step 3: after the traffic features in X are matched one-to-one with the labels in Y, inputting them into a plurality of classifiers for training, and adjusting the parameters of each model according to its classification accuracy on a test set to generate a multi-model decision group { M1, M2 };
Step 4: inputting the given network traffic, after the processing of step 2, into the multi-model decision group trained in step 3, and generating a classification result { A1, A2.,.
Step 5: making a decision according to the classification result obtained in step 4, either outputting the APP result or treating it as noise and discarding it.
Compared with traditional port matching and deep packet inspection, the APP traffic identification and denoising method based on multi-model decision disclosed in that patent document is highly general and simple to implement, supports identification of most current APPs, and can remove the interference caused by noise in network flows. However, its accuracy in identifying and distinguishing the interference traffic contained in encrypted HTTPS service applications is low, and its classification accuracy and model generalization are poor.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an interference flow identification method based on a service application classification model, which can distinguish interference flow from encrypted flow, accurately identify normal service application and interference flow, and improve classification accuracy and model generalization performance.
The invention is realized by the following technical scheme:
An interference traffic identification method based on a service application classification model, characterized by comprising the following steps:
a. collecting a clean data set 1 and splitting it by flow; for each of instant messaging, network transmission, network storage, network games, network video, web browsing and mail service, extracting the first 8 packets and computing feature statistics to form a feature vector of length 54, wherein each feature vector corresponds to one sample; and using all samples as training data to train the initialized model A;
b. setting up a test acquisition environment, and using a packet capture tool to capture test traffic for the instant messaging, network transmission, network storage, network games, network video, web browsing and mail service application labels to form a verification sample data set 2, which is input into model A as its test samples for verification;
c. randomly splitting the samples into training samples and test samples, training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain predicted test labels, and evaluating the test results with machine learning evaluation metrics;
d. training an initialized model C on the training samples, converting the raw traffic data, session flow by session flow, into model input data of 54 features, and inputting it into model C to obtain a major-category application label or an interference traffic label;
e. providing the obtained major-category application labels or interference traffic labels as quasi-real-time traffic identification result data through a real-time unified interface.
In step a, extracting the first 8 packets means that extraction is performed according to the performance characteristics of the service application.
In step a, the clean data set 1 refers to a data set that contains no interference traffic.
In step a, "all samples" refers to the number of clean flows.
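The flow splitting and first-8-packet extraction of step a can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Packet` record, the direction-independent 5-tuple key, and the function names are assumptions introduced here, since the text only states that traffic is split by flow and that the first 8 packets are used.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; field names are illustrative only."""
    ts: float      # capture timestamp in seconds
    src: str       # source IP address
    sport: int     # source port
    dst: str       # destination IP address
    dport: int     # destination port
    proto: str     # transport protocol, e.g. "tcp"
    length: int    # packet length in bytes


def flow_key(p):
    """Direction-independent 5-tuple, so both directions fall into one flow."""
    a, b = (p.src, p.sport), (p.dst, p.dport)
    return (p.proto,) + (a + b if a <= b else b + a)


def first_packets_per_flow(packets, n=8):
    """Group packets by flow and keep only the first n packets of each flow."""
    flows = defaultdict(list)
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        bucket = flows[flow_key(p)]
        if len(bucket) < n:
            bucket.append(p)
    return dict(flows)
```

Each entry of the returned dictionary holds at most 8 packets and would correspond to one sample once its features are computed.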
In step b, inputting data set 2 into model A as its test samples for verification specifically means that the labels of correctly classified flows are kept unchanged, and the labels of incorrectly classified flows are replaced with the interference traffic label.
In step c, the machine learning evaluation metrics include accuracy, recall, F1 score and the confusion matrix.
In step c, evaluating the test results specifically means analyzing the interference traffic identification effect: if the effect is good, proceed to step d; otherwise, return to step a and either increase the sample size or change the hyperparameters of the initialized model A.
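The evaluation metrics named for step c can be computed from true and predicted labels as in the sketch below. It is written in plain Python for illustration; the function names and the toy labels are assumptions and do not come from the patent.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(map(sum, cm))
    return sum(cm[i][i] for i in range(len(cm))) / total if total else 0.0


def per_class_metrics(cm, labels):
    """Return {label: (precision, recall, f1)} from a confusion matrix."""
    out = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(len(labels))) - tp
        fn = sum(cm[i]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        out[label] = (precision, recall, f1)
    return out


# Toy labels for illustration only.
labels = ["video", "web", "interference"]
y_true = ["video", "video", "web", "web", "interference", "interference"]
y_pred = ["video", "web", "web", "web", "interference", "web"]
cm = confusion_matrix(y_true, y_pred, labels)
metrics = per_class_metrics(cm, labels)
```

Whether the interference class is identified well enough to proceed to step d is a judgment the text leaves open; no threshold is assumed here.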
The initialized model A, model B and model C are random forest models.
The F1 score refers to a harmonic mean of the precision and recall.
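For reference, the harmonic mean mentioned above can be written out per class. The symbols TP, FP and FN (true positives, false positives and false negatives for the class in question) are standard notation introduced here, not taken from the patent text.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2\,P\,R}{P + R}
```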
The basic principle of the invention is as follows:
Interference traffic identification is essentially a process of converting semi-supervised learning into supervised learning. The trained model A classifies another batch of application-labeled data that contains unlabeled interference traffic; the flows it identifies as interference are given the interference traffic label, which converts the semi-supervised problem into a supervised one. Once the interference traffic labels have been obtained, the full data, combined with the live-network sample data, can be used to train the interference traffic identification model, and can also be provided on its own as an interference traffic data set.
Because only the side-channel information of the raw traffic is used, and no domain name field involving user privacy is needed, interference traffic can be distinguished within encrypted traffic, with better security.
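The label conversion described above can be sketched in a few lines. The constant `INTERFERENCE` and both function names are hypothetical; the rule itself (keep the label where model A agrees with it, otherwise relabel the flow as interference) follows the description of step b.

```python
INTERFERENCE = "interference"  # hypothetical name for the added class label


def relabel_with_interference(app_labels, model_a_predictions):
    """Keep labels that model A reproduces; mark the rest as interference."""
    return [
        label if label == predicted else INTERFERENCE
        for label, predicted in zip(app_labels, model_a_predictions)
    ]


def merge_datasets(clean_x, clean_y, verify_x, relabeled_y):
    """Combine the clean data set 1 with the relabeled data set 2."""
    return clean_x + verify_x, clean_y + relabeled_y


# Toy example: the second flow was captured under "web" but model A
# predicts "mail", so it is treated as interference traffic.
new_labels = relabel_with_interference(
    ["video", "web", "mail"], ["video", "mail", "mail"]
)
```

The merged data then carries one extra class, so a fully supervised classifier can be trained on it.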
The beneficial effects of the invention are mainly shown in the following aspects:
1. Through steps a to d described above, the invention, compared with the prior art, can distinguish interference traffic within encrypted traffic, accurately identify normal service applications and interference traffic, and improve classification accuracy and model generalization.
2. The invention can remove interference traffic so as to purify network traffic, which makes it easier to build more accurate service application scenario traffic data sets.
3. The invention collects only the first 8 packets of each session, so it has essentially no impact on user services.
4. The models are random forest models rather than deep learning models, so they have few parameters, train and predict quickly, and are easy to deploy on a platform.
5. The invention only uses the side channel information of the original flow, such as the packet length and the packet arrival time interval, and can distinguish the interference flow from the encrypted flow without using the domain name field related to the user privacy information, and has better security.
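The patent does not list the 54 features, so the following is only one possible layout, given as a sketch under stated assumptions: 3 packet groups (all, forward, backward) times 2 side-channel series (packet lengths and inter-arrival times) times 9 summary statistics gives 54 values. The input tuple format and the choice of statistics are assumptions made here for illustration.

```python
import statistics


def _stats(values):
    """Nine summary statistics of a numeric series (zeros if it is empty)."""
    if not values:
        return [0.0] * 9
    return [
        float(len(values)), float(sum(values)),
        float(min(values)), float(max(values)),
        statistics.fmean(values), float(statistics.median(values)),
        statistics.pstdev(values),
        float(values[0]), float(values[-1]),
    ]


def flow_features(packets, n=8):
    """Build a 54-value vector from the first n packets of one flow.

    packets: list of (timestamp, length, direction) tuples, where direction
    is +1 for client-to-server and -1 for server-to-client (an assumption).
    """
    packets = sorted(packets)[:n]
    groups = [
        packets,
        [p for p in packets if p[2] > 0],   # forward packets
        [p for p in packets if p[2] < 0],   # backward packets
    ]
    features = []
    for group in groups:
        lengths = [p[1] for p in group]
        gaps = [b[0] - a[0] for a, b in zip(group, group[1:])]
        features += _stats(lengths) + _stats(gaps)
    return features


vector = flow_features([(0.0, 100, 1), (0.1, 200, -1), (0.3, 300, 1)])
```

Only packet lengths and timing are read, which matches the side-channel constraint stated above; no payload or domain name field is touched.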
Drawings
The invention will be further specifically described with reference to the drawings and detailed description below:
FIG. 1 is a flow chart of the present invention.
Detailed Description
Example 1
Referring to FIG. 1, an interference traffic identification method based on a service application classification model includes the following steps:
a. collecting a clean data set 1 and splitting it by flow; for each of instant messaging, network transmission, network storage, network games, network video, web browsing and mail service, extracting the first 8 packets and computing feature statistics to form a feature vector of length 54, wherein each feature vector corresponds to one sample; and using all samples as training data to train the initialized model A;
b. setting up a test acquisition environment, and using a packet capture tool to capture test traffic for the instant messaging, network transmission, network storage, network games, network video, web browsing and mail service application labels to form a verification sample data set 2, which is input into model A as its test samples for verification;
c. randomly splitting the samples into training samples and test samples, training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain predicted test labels, and evaluating the test results with machine learning evaluation metrics;
d. training an initialized model C on the training samples, converting the raw traffic data, session flow by session flow, into model input data of 54 features, and inputting it into model C to obtain a major-category application label or an interference traffic label.
This embodiment is the most basic implementation. With steps a to d above, compared with the prior art, interference traffic can be distinguished within encrypted traffic, normal service applications and interference traffic can be accurately identified, and classification accuracy and model generalization are improved.
Example 2
Referring to FIG. 1, an interference traffic identification method based on a service application classification model includes the following steps:
a. collecting a clean data set 1 and splitting it by flow; for each of instant messaging, network transmission, network storage, network games, network video, web browsing and mail service, extracting the first 8 packets and computing feature statistics to form a feature vector of length 54, wherein each feature vector corresponds to one sample; and using all samples as training data to train the initialized model A;
b. setting up a test acquisition environment, and using a packet capture tool to capture test traffic for the instant messaging, network transmission, network storage, network games, network video, web browsing and mail service application labels to form a verification sample data set 2, which is input into model A as its test samples for verification;
c. randomly splitting the samples into training samples and test samples, training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain predicted test labels, and evaluating the test results with machine learning evaluation metrics;
d. training an initialized model C on the training samples, converting the raw traffic data, session flow by session flow, into model input data of 54 features, and inputting it into model C to obtain a major-category application label or an interference traffic label.
Preferably, the method further comprises step e: providing the obtained major-category application labels or interference traffic labels as quasi-real-time traffic identification result data through a real-time unified interface.
This embodiment is a preferred implementation. It can remove interference traffic so as to purify network traffic, which makes it easier to build more accurate service application scenario traffic data sets.
Example 3
Referring to FIG. 1, an interference traffic identification method based on a service application classification model includes the following steps:
a. collecting a clean data set 1 and splitting it by flow; for each of instant messaging, network transmission, network storage, network games, network video, web browsing and mail service, extracting the first 8 packets and computing feature statistics to form a feature vector of length 54, wherein each feature vector corresponds to one sample; and using all samples as training data to train the initialized model A;
b. setting up a test acquisition environment, and using a packet capture tool to capture test traffic for the instant messaging, network transmission, network storage, network games, network video, web browsing and mail service application labels to form a verification sample data set 2, which is input into model A as its test samples for verification;
c. randomly splitting the samples into training samples and test samples, training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain predicted test labels, and evaluating the test results with machine learning evaluation metrics;
d. training an initialized model C on the training samples, converting the raw traffic data, session flow by session flow, into model input data of 54 features, and inputting it into model C to obtain a major-category application label or an interference traffic label;
e. providing the obtained major-category application labels or interference traffic labels as quasi-real-time traffic identification result data through a real-time unified interface.
Further preferably, in step a, extracting the first 8 packets means that extraction is performed according to the performance characteristics of the service application.
In step a, the clean data set 1 refers to a data set that contains no interference traffic.
In step a, "all samples" refers to the number of clean flows.
This embodiment is a further preferred implementation. Because only the first 8 packets of each session are collected, there is essentially no impact on user services.
Example 4
Referring to FIG. 1, an interference traffic identification method based on a service application classification model includes the following steps:
a. collecting a clean data set 1 and splitting it by flow; for each of instant messaging, network transmission, network storage, network games, network video, web browsing and mail service, extracting the first 8 packets and computing feature statistics to form a feature vector of length 54, wherein each feature vector corresponds to one sample; and using all samples as training data to train the initialized model A;
b. setting up a test acquisition environment, and using a packet capture tool to capture test traffic for the instant messaging, network transmission, network storage, network games, network video, web browsing and mail service application labels to form a verification sample data set 2, which is input into model A as its test samples for verification;
c. randomly splitting the samples into training samples and test samples, training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain predicted test labels, and evaluating the test results with machine learning evaluation metrics;
d. training an initialized model C on the training samples, converting the raw traffic data, session flow by session flow, into model input data of 54 features, and inputting it into model C to obtain a major-category application label or an interference traffic label;
e. providing the obtained major-category application labels or interference traffic labels as quasi-real-time traffic identification result data through a real-time unified interface.
In step a, extracting the first 8 packets means that extraction is performed according to the performance characteristics of the service application.
In step a, the clean data set 1 refers to a data set that contains no interference traffic.
In step a, "all samples" refers to the number of clean flows.
Still more preferably, in step b, inputting data set 2 into model A as its test samples for verification specifically means that the labels of correctly classified flows are kept unchanged, and the labels of incorrectly classified flows are replaced with the interference traffic label.
This embodiment is a preferred implementation. The models are random forest models rather than deep learning models, so they have few parameters, train and predict quickly, and are easy to deploy on a platform.
Example 5
Referring to FIG. 1, an interference traffic identification method based on a service application classification model includes the following steps:
a. collecting a clean data set 1 and splitting it by flow; for each of instant messaging, network transmission, network storage, network games, network video, web browsing and mail service, extracting the first 8 packets and computing feature statistics to form a feature vector of length 54, wherein each feature vector corresponds to one sample; and using all samples as training data to train the initialized model A;
b. setting up a test acquisition environment, and using a packet capture tool to capture test traffic for the instant messaging, network transmission, network storage, network games, network video, web browsing and mail service application labels to form a verification sample data set 2, which is input into model A as its test samples for verification;
c. randomly splitting the samples into training samples and test samples, training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain predicted test labels, and evaluating the test results with machine learning evaluation metrics;
d. training an initialized model C on the training samples, converting the raw traffic data, session flow by session flow, into model input data of 54 features, and inputting it into model C to obtain a major-category application label or an interference traffic label;
e. providing the obtained major-category application labels or interference traffic labels as quasi-real-time traffic identification result data through a real-time unified interface.
In step a, extracting the first 8 packets means that extraction is performed according to the performance characteristics of the service application.
In step a, the clean data set 1 refers to a data set that contains no interference traffic.
In step a, "all samples" refers to the number of clean flows.
In step b, inputting data set 2 into model A as its test samples for verification specifically means that the labels of correctly classified flows are kept unchanged, and the labels of incorrectly classified flows are replaced with the interference traffic label.
In step c, the machine learning evaluation metrics include accuracy, recall, F1 score and the confusion matrix.
In step c, evaluating the test results specifically means analyzing the interference traffic identification effect: if the effect is good, proceed to step d; otherwise, return to step a and either increase the sample size or change the hyperparameters of the initialized model A.
This embodiment is the best mode. It uses only the side-channel information of the raw traffic, such as packet length and packet inter-arrival time, and does not need the domain name field involving user privacy, so interference traffic can be distinguished within encrypted traffic, with better security.
The invention is experimentally verified as follows:
The clean data set 1 for model A comes from 7 major categories of data collected by a broadband access server on the live network: instant messaging, network transmission, network storage, network games, network video, web browsing and mail service. After sample balancing, 17.4 GB of data yields 84292 sessions, corresponding to 84292 samples. The traffic data that is input into model A to be given interference traffic labels comes from 5.9 GB of data across the same 7 major categories, collected by running a packet capture tool while launching the applications directly on a PC. Each major category contains 600-1000 MB of stored packets, and the data yields 41563 sessions, corresponding to 41563 samples.
The model is a random forest classification model, invoked directly from a machine learning package with default parameters.
Model A achieves a classification accuracy of 88.89% on the test-collected data from the second step. A total of 3682 samples are misclassified; these are uniformly relabeled as interference traffic and merged with the live-network samples from the first step, giving a combined sample size of 125855. The combined samples are randomly split into training samples and test samples. After model B is trained on the training samples, its classification accuracy on the test samples is 90.61%. The sample size of each application type is shown in Table 1.
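The merge and random split described above can be sketched as follows. The sample counts 84292 and 41563 come from the text; the 80/20 split ratio, the fixed seed, and the function name are assumptions, because the patent does not state how the split was proportioned.

```python
import random


def random_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and cut it into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


clean_count = 84292    # samples from the clean live-network data set 1
verify_count = 41563   # samples from the relabeled capture data set 2

# Sample indices stand in for the actual feature vectors in this sketch.
combined = list(range(clean_count + verify_count))
train, test = random_split(combined)
```

A fixed seed makes the split reproducible, which helps when the loop of step c sends the process back to step a with more samples or different hyperparameters.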
The sample distribution, confusion matrix of model a predictors, confusion matrix of model B predictors, accuracy P of model B predictions for each category, and recall R, F1 score F1 are shown in table 2.
TABLE 1
TABLE 2
Application category | Precision (P) | Recall (R) | F1 score (F1) |
---|---|---|---|
Instant messaging | 0.928 | 0.8845 | 0.9057 |
Network transmission | 0.9143 | 0.9249 | 0.9195 |
Network storage | 0.9782 | 0.9874 | 0.9828 |
Network game | 0.8361 | 0.7612 | 0.7969 |
Network video | 0.9291 | 0.8429 | 0.8839 |
Web browsing | 0.7719 | 0.898 | 0.8302 |
Mail service | 0.9845 | 0.9922 | 0.9883 |
Interference flow | 0.8019 | 0.9043 | 0.85 |
As Table 2 shows, the precision, recall and F1 score of the interference flow category are all above 80%, which demonstrates that the invention identifies interference flows effectively.
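For reference, the per-category indices in Table 2 follow the usual definitions: precision is the share of predictions for a category that are correct, recall is the share of that category's true samples that are found, and F1 is their harmonic mean. The sketch below computes them from a confusion matrix and checks that the F1 values in Table 2 agree with the reported P and R. The function names and the toy 2x2 matrix are assumptions for illustration; the confusion matrices themselves are not reproduced in this text.

```python
def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


def per_class_metrics(confusion, labels):
    """confusion[i][j] = samples whose true class is i and predicted class is j."""
    metrics = {}
    for k, name in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column sum
        actually_k = sum(confusion[k])                     # row sum
        p = true_positive / predicted_as_k if predicted_as_k else 0.0
        r = true_positive / actually_k if actually_k else 0.0
        metrics[name] = (p, r, f1_score(p, r))
    return metrics


# Toy matrix (made-up counts, not the patent's data).
toy = per_class_metrics([[8, 2], [1, 9]], ["application", "interference flow"])

# (P, R, F1) exactly as listed in Table 2.
TABLE_2 = {
    "Instant messaging": (0.928, 0.8845, 0.9057),
    "Network transmission": (0.9143, 0.9249, 0.9195),
    "Network storage": (0.9782, 0.9874, 0.9828),
    "Network game": (0.8361, 0.7612, 0.7969),
    "Network video": (0.9291, 0.8429, 0.8839),
    "Web browsing": (0.7719, 0.898, 0.8302),
    "Mail service": (0.9845, 0.9922, 0.9883),
    "Interference flow": (0.8019, 0.9043, 0.85),
}

# Largest gap between the reported F1 and the F1 recomputed from P and R.
max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in TABLE_2.values())
```

The recomputed F1 values differ from the tabulated ones only by rounding in the fourth decimal place.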
Claims (8)
1. An interference flow identification method based on a service application classification model, characterized by comprising the following steps:
a. collecting a clean data set 1, splitting it by flow, extracting the first 8 packets of each flow for instant messaging, network transmission, network storage, network games, network video, web browsing and mail service, and performing feature statistics to form a feature vector of length 54, each feature vector corresponding to one sample, and using all samples as training data to train an initialized model A;
b. constructing a test capture environment, and using a packet capture tool to capture test traffic labelled as instant messaging, network transmission, network storage, network games, network video, web browsing and mail service applications to form a verification sample data set 2, which is input into the initialized model A as its test samples for verification;
c. randomly splitting the samples into training samples and test samples, training an initialized model B on the training samples, inputting the test samples into the trained model B to obtain output test labels, and evaluating the test results by machine learning evaluation indices;
d. training an initialized model C on the training samples, forming model input data of 54 features from the original flow data by session flow, and inputting it into model C to obtain a major-category application label or an interference flow label.
2. The interference flow identification method based on the service application classification model according to claim 1, further comprising a step e: providing the obtained major-category application labels or interference flow labels as near-real-time flow identification result data through a real-time unified interface.
3. The interference flow identification method based on the service application classification model according to claim 1, wherein: in the step a, extracting the first 8 packets refers to extracting according to the service application performance characteristics.
4. The interference flow identification method based on the service application classification model according to claim 1, wherein: in the step a, the clean data set 1 refers to a data set without interference flow.
5. The interference flow identification method based on the service application classification model according to claim 1, wherein: in the step a, all samples refer to the number of pure streams.
6. The interference flow identification method based on the service application classification model according to claim 1, wherein: in the step b, inputting the verification sample data set 2 into the initialized model A as its test samples for verification specifically means that flow labels that are classified correctly are kept unchanged, and flow labels that are classified incorrectly are replaced with the interference flow label.
7. The interference flow identification method based on the service application classification model according to claim 1, wherein: in the step c, the machine learning evaluation index comprises accuracy, recall, F1 score and confusion matrix.
8. The interference flow identification method based on the service application classification model according to claim 1, wherein: in the step c, evaluating the test results specifically means analyzing the interference flow identification effect; if the effect is good, the method proceeds to step d; otherwise it returns to step a, and the sample size is increased or the hyperparameters of the initialized model A are changed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311677029.9A CN117650935A (en) | 2023-12-08 | 2023-12-08 | Interference flow identification method based on service application classification model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311677029.9A CN117650935A (en) | 2023-12-08 | 2023-12-08 | Interference flow identification method based on service application classification model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117650935A true CN117650935A (en) | 2024-03-05 |
Family
ID=90049223
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311677029.9A Pending CN117650935A (en) | 2023-12-08 | 2023-12-08 | Interference flow identification method based on service application classification model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117650935A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118171196A (en) * | 2024-05-16 | 2024-06-11 | 电子科技大学 | Video service customer experience quality prediction method based on optical network big data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110391958B (en) | Method for automatically extracting and identifying characteristics of network encrypted flow | |
CN102315974B (en) | Stratification characteristic analysis-based method and apparatus thereof for on-line identification for TCP, UDP flows | |
US9201953B2 (en) | Filtering information using targeted filtering schemes | |
CN102984269B (en) | A kind of point-to-point method for recognizing flux and device | |
CN109361617A (en) | A kind of convolutional neural networks traffic classification method and system based on network payload package | |
CN105871832A (en) | Network application encrypted traffic recognition method and device based on protocol attributes | |
CN112491917B (en) | Unknown vulnerability identification method and device for Internet of things equipment | |
CN113259313A (en) | Malicious HTTPS flow intelligent analysis method based on online training algorithm | |
CN113283498B (en) | VPN flow quick identification method for high-speed network | |
CN106789242B (en) | Intelligent identification application analysis method based on mobile phone client software dynamic feature library | |
CN110868409A (en) | Passive operating system identification method and system based on TCP/IP protocol stack fingerprint | |
CN109275045B (en) | DFI-based mobile terminal encrypted video advertisement traffic identification method | |
CN106330584A (en) | Identification method and identification device of business flow | |
CN111711545A (en) | Intelligent encrypted flow identification method based on deep packet inspection technology in software defined network | |
CN108289125A (en) | TCP sessions recombination based on Stream Processing and statistical data extracting method | |
CN117650935A (en) | Interference flow identification method based on service application classification model | |
CN111600878A (en) | Low-rate denial of service attack detection method based on MAF-ADM | |
CN112019500B (en) | Encrypted traffic identification method based on deep learning and electronic device | |
CN107404398A (en) | A kind of networks congestion control judgement system | |
CN114650229A (en) | Network encryption traffic classification method and system based on three-layer model SFTF-L | |
CN111200543A (en) | Encryption protocol identification method based on active service detection engine technology | |
CN111917665A (en) | Terminal application data stream identification method and system | |
CN114884894B (en) | Semi-supervised network traffic classification method based on transfer learning | |
CN113746707B (en) | Encrypted traffic classification method based on classifier and network structure | |
CN112929364B (en) | Data leakage detection method and system based on ICMP tunnel analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||