CN118277914A - Mobile application classification method based on dynamic and static combination of multidimensional APK features - Google Patents
- Publication number
- CN118277914A (application number CN202311471891.4A)
- Authority
- CN
- China
- Prior art keywords
- app
- apk
- dynamic
- information
- static
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/24765—Rule-based classification
Abstract
The invention relates to the technical field of APP classification analysis and discloses a mobile application classification method based on dynamic and static combination of multidimensional APK features. The method first constructs APP features: APP information is collected from mainstream mobile app stores, small Internet distribution platforms and APP promotion pages; the service category of each APP is identified from the functions it provides or the information content it presents; and communication information is collected to form an initial test data set. The APP source code is then analyzed to obtain the static source code features, dynamic traffic and page feature data of the APP, specifically name, traffic and content information. Finally, a rule matching model and a matching mechanism are established: a timed scanning program is built, and each APP is identified and judged by the preset rule matching model of each category. The method achieves high recognition accuracy for APPs with distinct technical or content features and reduces the need for manual review.
Description
Technical Field
The invention relates to the technical field of APP classification analysis, in particular to a mobile application classification method based on dynamic and static combination of multidimensional APK features.
Background
The technical field of APK (Android application package file) classification analysis has developed considerably in recent years. Existing approaches include the following:
1. APK classification method based on machine learning and deep learning: (1) the App name is entered into an Internet search engine, and the results are processed to obtain an App document; (2) keyword distribution features are extracted based on a vector space model, and a base classifier is trained on them using a shallow learning technique; (3) word vectors are trained with word2vec, and a second base classifier is trained on them using a convolutional neural network; (4) a collaborative learning framework is designed, the two base classifiers are co-trained on unlabeled samples, and the training results are fused to obtain the final App classifier. This approach achieves personalized App classification using only the App name; it needs only a small number of labeled samples to build a classification model with relatively high accuracy; and the collaborative learning framework accounts for the performance imbalance between the base classifiers and can reduce the influence of noisy data in the unlabeled samples.
2. Classification method based on APP user comments: APP user comment data is acquired, cleaned and labeled; an SVTEO model and an NBTEO model are established. The SVTEO model is built by extracting the Encoder part of a Transformer model to obtain a Transformer-Encoder-Only layer, connecting a pooling layer after the Transformer-Encoder-Only layer, and connecting a linear layer and a support vector layer in parallel after the pooling layer. The NBTEO model is built by extracting the Encoder part of a Transformer model to obtain a Transformer-Encoder-Only layer, connecting a pooling layer after the Transformer-Encoder-Only layer, and connecting a linear layer and a naive Bayes layer in parallel after the pooling layer. The Transformer-Encoder-Only layer comprises an embedding layer and six Encoder layers. Machine learning, deep-learning homogeneous learning, heterogeneous learning across the SVTEO and NBTEO models, and parameter fine-tuning are applied to the linear layers of the SVTEO and NBTEO models according to the label data, and the processed SVTEO and NBTEO models together form a user comment requirement classification model. APP user comment data to be classified is input into this model for classification and labeling, yielding a classification label for the comment data.
For example, the application with application number "CN201810073847.0" discloses a method for automatically classifying APP content into categories based on keywords, comprising the following steps: A. a classification system is established and corresponding keyword databases are set up, with several groups of keyword databases built according to the classification categories and stored in a storage module; B. the APP acquires the uploaded information content through an acquisition module and transmits it to the server; C. the server matches the received information content against the groups of keyword databases through a matching authentication module; D. if the information content matches one or more groups of keyword databases, the server feeds the classification category of each matched keyword database back to the APP, and the APP assigns the information content to the corresponding category through an execution module; if the information content matches none of the keyword databases, the server feeds the failure back to the APP, and the APP classifies the information content separately through the execution module.
Existing APP classification techniques use few dimensions. They mainly analyze a small number of static source code features such as names, LOGOs and keywords, but static features alone have weak discriminative power and cannot fully and accurately identify the actual service category of an APP. In particular, APPs such as counterfeit imitations or providers of illegal content are difficult to identify accurately from a limited set of static source code feature dimensions.
In view of the above, there is a need for a mobile application classification method based on dynamic and static combined multi-dimensional APK features.
Disclosure of Invention
The invention aims to provide a mobile application classification method based on dynamic and static combination of multi-dimensional APK features. The method extracts dynamic and static multidimensional features of an APP: source code features (APP name, LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDKs and static libraries), traffic features (main service domain names, IPs, paths, header information and request content) and page content features (page display content and snapshots). For APP classification and recognition scenarios such as the communication category, it extracts combinations of common technical or content features, designs an APP feature extraction method that combines dynamic and static features, and builds a two-stage classification and recognition model for specific APP types, consisting of a rule matching model and a scoring and sorting model, thereby achieving effective classification and recognition of mobile applications in massive APP data.
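The traffic features named above (domain name, IP, path, header information, request content) can be illustrated with a minimal Python sketch. The patent does not say how captured requests are represented; the `TrafficFeatures` record and the `extract_traffic_features` function are hypothetical names, and the sketch assumes that each captured request is already available as a URL, a header mapping and a body string.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Dict, Optional
from urllib.parse import urlsplit


@dataclass
class TrafficFeatures:
    """Hypothetical record for one captured request."""
    domain: Optional[str]   # set when the host is a name
    ip: Optional[str]       # set when the host is a literal IP address
    path: str
    headers: Dict[str, str]
    body: str


def extract_traffic_features(url: str, headers: Dict[str, str], body: str = "") -> TrafficFeatures:
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        ip_address(host)            # raises ValueError for anything that is not an IP
        domain, ip = None, host
    except ValueError:
        domain, ip = (host or None), None
    # Header names are lower-cased so that later keyword matching is case-insensitive.
    normalized = {name.lower(): value for name, value in headers.items()}
    return TrafficFeatures(domain, ip, parts.path or "/", normalized, body)
```

Keeping the domain and the IP in separate fields lets later rules treat them as separate matching dimensions.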
The invention is realized in the following way:
The invention provides a mobile application classification method based on dynamic and static combination of multi-dimensional APK features, which is implemented by the following steps:
S1, constructing APP features: collecting APP information from mainstream mobile app stores, small Internet distribution platforms and APP promotion pages; identifying the service category of each APP from the functions it provides or the information content it presents; and collecting communication information to form an initial test data set;
S2, analyzing the APP source code to obtain the static source code features, dynamic traffic and page feature data of the APP, specifically name, traffic and content information; the name information comprises the LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDKs and static libraries; the traffic information comprises domain names, IPs, paths, header information and request content; the content information comprises page display content and snapshot information.
The APP source code is analyzed by parsing the APK package file: an information extraction function is built according to the Android design specifications and the output requirements, and obtains the name, LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDK and static library information;
For runtime analysis, an automated control program is built that automatically installs and runs the APP in an Android device environment and extracts the APP's communication traffic and page presentation content. When the features of communication-class APPs are analyzed, feature data covering functions, permissions, layout and content data are included, and a rule data set for rule matching and a rule set for scoring and sorting are constructed;
Features of different dimensions are configured according to the service or technical characteristics of the APP, the specific features of an APP including keywords; a communication APP is characterized by keywords such as address book, friends and transmission, and by specific SDKs and static libraries for data transmission and data encryption.
S3, establishing a rule matching model and a matching mechanism: a timed scanning program is built that extracts the formatted, stored data as objects and identifies and judges it with the preset rule matching model of each category; if any rule is hit, a classification label is marked and the APP enters the state of waiting for the scoring and sorting model; the scanned data is prepared by acquiring the static source code features, dynamic traffic features and dynamic content features of the APP and performing data cleaning and formatted storage on the full data;
The static source code features comprise the name, LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDK and static library information;
The dynamic traffic features comprise domain name, IP, path, header information and request content information; the dynamic content features comprise, for example, page display content, page snapshots and page layout file information.
S4, establishing a scoring and sorting model and a screening mechanism model: a timed scanning program is built for the APPs that carry a service classification label and are in the pending-review state, which scans these APPs and filters them by rules;
S5, performing rule matching on a large number of APPs with the rule matching model, screening them with the screening mechanism model, and outputting the result to classify the APPs.
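The static analysis in step S2 can be sketched in Python under some assumptions. An APK is a ZIP archive, so the standard `zipfile` module can list its entries. The function name `extract_static_features`, the returned keys, and the choice to hash the v1 signature block files under `META-INF/` as a stand-in for the "signature HASH" are illustrative assumptions, not the patent's extraction function. Signatures stored only in the APK Signing Block (v2 and later) are not visible this way, and `AndroidManifest.xml` is stored as binary XML, so the package name, permissions and service declarations would need a dedicated parser that is not shown here.

```python
import hashlib
import zipfile


def extract_static_features(apk_file):
    """Collect a few static features from an APK (path or binary file object)."""
    features = {"signature_hashes": [], "static_libraries": [], "layout_files": []}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            upper = name.upper()
            if upper.startswith("META-INF/") and upper.endswith((".RSA", ".DSA", ".EC")):
                # Assumption: SHA-256 of the v1 signature block file serves as the signature hash.
                digest = hashlib.sha256(apk.read(name)).hexdigest()
                features["signature_hashes"].append(digest)
            elif name.startswith("lib/") and name.endswith(".so"):
                # Native libraries are stored per ABI; keep only the file name.
                features["static_libraries"].append(name.rsplit("/", 1)[-1])
            elif name.startswith("res/layout") and name.endswith(".xml"):
                features["layout_files"].append(name)
    # The same library usually appears once per ABI, so deduplicate.
    features["static_libraries"] = sorted(set(features["static_libraries"]))
    return features
```

The result is a plain dictionary, which fits the "formatted storage" that step S3 scans later.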
Further, in step S4, the specific rules are: keyword feature matching is performed on the APK name, the localized configuration, and the domain name, IP, path, header information and request content of the traffic; data set feature matching is performed on the LOGO, package name, signature HASH and signature OWNER; and a score is calculated for each dimension of the APK according to the matching degree and the weight of the feature.
For each APK, the scores are summarized into two totals, A and B: A is the sum of the scores of the APK name and localized configuration content attributes, and B is the sum of the scores of dimensions such as the package name, signature and LOGO. The number of hit dimensions is also recorded: A1 is the number of hit dimensions in A, and B1 is the number of hit dimensions in B.
APKs meeting the classification requirement are then judged: if B1 is greater than 1, that is, the APK matches the feature rules in several of the package name, signature and LOGO dimensions, the APK is judged to belong to the target category;
In step S5, if B1 is equal to 1 and A1 is greater than 1, manual review is performed: because the APK name and localized configuration match rules in several dimensions, further review is needed to determine the classification. Feature rule extraction and merging: feature rules are extracted from the APKs judged in the previous step to belong to the category, and the extracted rules are merged into the rule set of step S4, so as to enrich the feature rules and support future rule updates and improvements.
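The A/B summary and the two decision rules above can be written as a short Python sketch. The dimension names, the `ScoreSummary` type and the functions `summarize` and `decide` are hypothetical. The text only specifies the outcomes for B1 > 1 and for B1 = 1 with A1 > 1; the fallback label `"undetermined"` for every other case is an assumption made here.

```python
from dataclasses import dataclass
from typing import Dict

CONTENT_DIMS = {"name", "localized_config"}             # dimensions summed into A
IDENTITY_DIMS = {"package_name", "signature", "logo"}   # dimensions summed into B


@dataclass
class ScoreSummary:
    a: float   # total score of group A
    b: float   # total score of group B
    a1: int    # number of hit dimensions in A
    b1: int    # number of hit dimensions in B


def summarize(scores: Dict[str, float]) -> ScoreSummary:
    """scores maps a dimension name to its score; a score of 0 means no hit."""
    a = sum(v for k, v in scores.items() if k in CONTENT_DIMS)
    b = sum(v for k, v in scores.items() if k in IDENTITY_DIMS)
    a1 = sum(1 for k, v in scores.items() if k in CONTENT_DIMS and v > 0)
    b1 = sum(1 for k, v in scores.items() if k in IDENTITY_DIMS and v > 0)
    return ScoreSummary(a, b, a1, b1)


def decide(summary: ScoreSummary) -> str:
    if summary.b1 > 1:
        return "classified"        # several identity dimensions matched
    if summary.b1 == 1 and summary.a1 > 1:
        return "manual_review"     # one identity hit plus several content hits
    return "undetermined"          # assumption: the text does not specify this case
```

For instance, an APK that hits the package name plus both the name and the localized configuration goes to manual review, while one that hits both the package name and the LOGO is classified directly.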
Further, the present invention provides a computer readable storage medium storing a computer program which when executed by a main controller implements a method as described in any one of the above.
Further, the core principle of the invention is to construct a two-layer classification and recognition model from the basic static source code features, dynamic traffic features and dynamic content features of the APP, so as to classify, recognize and accurately assess APPs. The core flow comprises:
1. Extracting the features of a specific type of APP, eighteen types in total: static source code features (name, LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDKs, static libraries), dynamic traffic features (domain name, IP, path, header information, request content) and content features (page display content, snapshots);
2. Constructing a rule matching model that uses the APP features extracted in the previous step as the recognition and matching features of a specific category, and detecting and labeling service categories in massive APP data;
3. Constructing a scoring and sorting model that further identifies and assesses the APPs found in the previous step: a scoring standard is formulated from multidimensional combination rules over the static source code features, dynamic traffic features and dynamic content features, an output threshold is set, and accurately classified data is output;
4. Optimizing and supplementing the features: the data labeled by the system is reviewed manually, the final data is output, and part of the feature data is fed back into steps S3 and S4.
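The first stage of this flow, rule matching followed by labeling and hand-off to the scoring stage, might look like the following Python sketch. The record layout, the state names `"stored"` and `"pending_scoring"`, and the functions `hits` and `scan_once` are assumptions for illustration. A real deployment would call `scan_once` from a scheduler to obtain the timed scanning described in step S3.

```python
def hits(features, rules):
    """True if any keyword of any dimension occurs in the record's values for that dimension."""
    return any(
        keyword.lower() in value.lower()
        for dimension, keywords in rules.items()
        for value in features.get(dimension, [])
        for keyword in keywords
    )


def scan_once(records, rule_models):
    """Label every stored record that hits at least one category rule model.

    records: list of dicts with "state" and "features" ({dimension: [values]}).
    rule_models: {category: {dimension: [keywords]}}.
    Returns the number of records that received a label in this pass.
    """
    labeled = 0
    for record in records:
        if record.get("state") != "stored":
            continue  # already labeled or processed in an earlier pass
        labels = [
            category
            for category, rules in rule_models.items()
            if hits(record["features"], rules)
        ]
        if labels:
            record["labels"] = labels
            record["state"] = "pending_scoring"  # waits for the scoring and sorting model
            labeled += 1
    return labeled
```

Records without any hit keep the `"stored"` state, so they are scanned again once the rule set has been extended.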
Compared with the prior art, the invention has the beneficial effects that:
1. In massive APP analysis scenarios, the method can rapidly identify the APPs of each preset category.
2. The method has high recognition accuracy for APPs with distinct technical or content features and reduces the need for manual review.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some examples of the present invention and therefore should not be considered as limiting its scope; a person skilled in the art may obtain other related drawings from these drawings without inventive effort.
FIG. 1 is an overall process flow diagram of the present invention;
FIG. 2 is a flowchart of APP class execution of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention. The following detailed description of the embodiments, as presented in the figures, is therefore not intended to limit the scope of the invention as claimed, but merely represents selected embodiments. All other embodiments obtained by those of ordinary skill in the art from these embodiments without inventive effort fall within the scope of the invention.
Referring to FIGS. 1 and 2, a mobile application classification method based on dynamic and static combination of multi-dimensional APK features is implemented according to the following steps:
S1, constructing APP features: collecting APP information from mainstream mobile app stores, small Internet distribution platforms and APP promotion pages; identifying the service category of each APP from the functions it provides or the information content it presents; and collecting communication information to form an initial test data set;
S2, analyzing the APP source code to obtain the static source code features, dynamic traffic and page feature data of the APP, specifically name, traffic and content information; the name information comprises the LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDKs and static libraries; the traffic information comprises domain names, IPs, paths, header information and request content; the content information comprises page display content and snapshot information.
The APP source code is analyzed by parsing the APK package file: an information extraction function is built according to the Android design specifications and the output requirements, and obtains the name, LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDK and static library information;
For runtime analysis, an automated control program is built that automatically installs and runs the APP in an Android device environment and extracts the APP's communication traffic and page presentation content. When the features of communication-class APPs are analyzed, feature data covering functions, permissions, layout and content data are included, and a rule data set for rule matching and a rule set for scoring and sorting are constructed;
Features of different dimensions are configured according to the service or technical characteristics of the APP, the specific features of an APP including keywords; a communication APP is characterized by keywords such as address book, friends and transmission, and by specific SDKs and static libraries for data transmission and data encryption.
S3, establishing a rule matching model and a matching mechanism: a timed scanning program is built that extracts the formatted, stored data as objects and identifies and judges it with the preset rule matching model of each category; if any rule is hit, a classification label is marked and the APP enters the state of waiting for the scoring and sorting model; the scanned data is prepared by acquiring the static source code features, dynamic traffic features and dynamic content features of the APP and performing data cleaning and formatted storage on the full data;
The static source code features comprise the name, LOGO, package name, signature HASH, signature OWNER, localized configuration, layout files, permissions, service declarations, SDK and static library information;
The dynamic traffic features comprise domain name, IP, path, header information and request content information; the dynamic content features comprise, for example, page display content, page snapshots and page layout file information. The feature dimensions are listed in Table 1;
Table 1. Feature dimension list
Dimension | Matching type |
Name | Exact matching and fuzzy matching |
LOGO | Exact matching and fuzzy matching |
Package name | Exact matching and fuzzy matching |
Signature | Exact matching |
Permissions | Exact matching |
Service declaration | Exact matching |
SDK | Exact matching |
Static library | Exact matching |
Localized configuration | Exact matching and fuzzy matching |
Communication traffic domain name | Exact matching and fuzzy matching |
Communication traffic content | Exact matching and fuzzy matching |
S4, establishing a scoring and sorting model and a screening mechanism model: a timed scanning program is built for the APPs that carry a service classification label and are in the pending-review state, which scans these APPs and filters them by rules;
S5, performing rule matching on a large number of APPs with the rule matching model, screening them with the screening mechanism model, and outputting the result to classify the APPs.
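Table 1 distinguishes dimensions that allow only exact matching from those that also allow fuzzy matching. A minimal Python sketch of that distinction follows. The patent does not name a fuzzy matching algorithm; `difflib.SequenceMatcher` from the standard library and the 0.8 threshold are assumptions chosen for illustration, as are the dimension keys and the function name `match_degree`. Fuzzy matching of a LOGO image would in practice need an image similarity measure, so the sketch treats every value as a string.

```python
from difflib import SequenceMatcher

# Dimensions that Table 1 lists with "exact matching and fuzzy matching".
FUZZY_DIMS = {
    "name", "logo", "package_name", "localized_config",
    "traffic_domain", "traffic_content",
}


def match_degree(dimension: str, value: str, pattern: str, threshold: float = 0.8) -> float:
    """Return a matching degree in [0, 1] for one feature value against one rule pattern."""
    if value == pattern:
        return 1.0                       # exact match is allowed for every dimension
    if dimension in FUZZY_DIMS:
        ratio = SequenceMatcher(None, value.lower(), pattern.lower()).ratio()
        return ratio if ratio >= threshold else 0.0
    return 0.0                           # exact-only dimension (signature, permissions, ...)
```

The returned degree can then be multiplied by a per-dimension weight to obtain the dimension score used by the scoring and sorting model.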
Further, in step S4, the specific rules are: keyword feature matching is performed on the APK name, the localized configuration, and the domain name, IP, path, header information and request content of the traffic; data set feature matching is performed on the LOGO, package name, signature HASH and signature OWNER; and a score is calculated for each dimension of the APK according to the matching degree and the weight of the feature. The weights are designed from the technical features and content features of the APK: technical features are weighted higher than content features, and the weight of each technical feature is determined by the degree of association of the feature attributes. Weight reference examples are shown in Table 2;
Table 2 weight reference examples
For each APK, the scores are summarized into two totals, A and B: A is the sum of the scores of the APK name and localized configuration content attributes, and B is the sum of the scores of dimensions such as the package name, signature and LOGO. The number of hit dimensions is also recorded: A1 is the number of hit dimensions in A, and B1 is the number of hit dimensions in B.
APKs meeting the classification requirement are then judged: if B1 is greater than 1, that is, the APK matches the feature rules in several of the package name, signature and LOGO dimensions, the APK is judged to belong to the target category. The demarcation criteria for score calculation are shown in Table 3;
TABLE 3 demarcation criteria for score calculation
In step S5, if B1 is equal to 1 and A1 is greater than 1, manual review is performed: because the APK name and localized configuration match rules in several dimensions, further review is needed to determine the classification. Feature rule extraction and merging: feature rules are extracted from the APKs judged in the previous step to belong to the category, and the extracted rules are merged into the rule set of step S4, so as to enrich the feature rules and support future rule updates and improvements.
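The feature rule extraction and merging step can be sketched as follows. The rule set layout, the function name `merge_feature_rules` and the choice of which dimensions to harvest (package name, signature hash, signature owner) are assumptions; the patent does not specify which features of a confirmed APK are promoted to rules.

```python
HARVESTED_DIMS = ("package_name", "signature_hash", "signature_owner")  # assumed selection


def merge_feature_rules(rule_set, confirmed_apks, category):
    """Merge feature values of confirmed APKs into the rule set of one category.

    rule_set: {category: {dimension: set of values}}, modified in place.
    confirmed_apks: list of dicts holding the extracted features of each confirmed APK.
    Returns the number of new rule values that were added.
    """
    dimensions = rule_set.setdefault(category, {})
    added = 0
    for apk in confirmed_apks:
        for dim in HARVESTED_DIMS:
            value = apk.get(dim)
            if value and value not in dimensions.setdefault(dim, set()):
                dimensions[dim].add(value)
                added += 1
    return added
```

Storing rule values in sets keeps repeated merges idempotent: confirming the same APK twice adds nothing new.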
In this embodiment, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a main controller, implements a method as described in any one of the above.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (9)
1. A mobile application classification method based on dynamic and static combination of multidimensional APK features, characterized by comprising the following steps:
S1, constructing APP features: collecting APP information from mainstream mobile app stores, small Internet distribution platforms and APP promotion pages; identifying the service category of each APP from the functions it provides or the information content it presents; and collecting communication information to form an initial test data set;
S2, analyzing the APP source code to obtain the static source code features, dynamic traffic and page feature data of the APP, specifically name, traffic and content information;
S3, establishing a rule matching model and a matching mechanism: a timed scanning program is built that extracts the formatted, stored data as objects and identifies and judges it with the preset rule matching model of each category; if any rule is hit, a classification label is marked and the APP enters the state of waiting for the scoring and sorting model;
S4, establishing a scoring and sorting model and a screening mechanism model: a timed scanning program is built for the APPs that carry a service classification label and are in the pending-review state, which scans these APPs and filters them by rules;
S5, performing rule matching on a large number of APPs with the rule matching model, screening them with the screening mechanism model, and outputting the result to classify the APPs.
2. The mobile application classification method based on dynamic and static combination multidimensional APK feature according to claim 1, wherein in step S 2, the names include LOGO, package name, signature HASH, signature OWNER, localization configuration, layout file, rights, service declaration, SDK, static library, and the traffic includes domain name, IP, path, header information, request content; the content specifically comprises page display content and snapshot information.
3. The mobile application classification method based on dynamic and static combination of multidimensional APK features according to claim 1, wherein in step S2 the APP source code is analyzed by parsing the APK package file, and an information extraction function is built according to the Android design specifications and output requirements to obtain the name, LOGO, package name, signature HASH, signature OWNER, localization configuration, layout files, permissions, service declarations, SDK, and static library information;
and, for runtime analysis, an automated control program is built that automatically installs and runs the APP in an Android device environment and extracts the APP's communication traffic and page presentation content information.
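Part of the static extraction in claim 3 can be sketched with the Python standard library alone, since an APK is a ZIP archive. The function name `extract_static_features` and the returned keys are hypothetical. Hashing the `META-INF` signature block file is only a stand-in for the "signature HASH": it covers v1 (JAR) signing and does not see APK Signature Scheme v2 and later. Decoding the binary `AndroidManifest.xml` (package name, permissions, service declarations) needs a dedicated parser and is not attempted here.

```python
import hashlib
import zipfile

# File suffixes used for v1 (JAR-style) signature block files inside META-INF/.
SIGNATURE_SUFFIXES = (".RSA", ".DSA", ".EC")


def extract_static_features(apk_path):
    """List layout files, native libraries, and signature-file hashes of an APK.

    `apk_path` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(apk_path) as apk:
        names = apk.namelist()
        signature_hashes = {
            name: hashlib.sha256(apk.read(name)).hexdigest()
            for name in names
            if name.startswith("META-INF/")
            and name.upper().endswith(SIGNATURE_SUFFIXES)
        }
    return {
        # res/layout/ plus qualified variants such as res/layout-land/
        "layout_files": sorted(
            n for n in names if n.startswith("res/layout") and n.endswith(".xml")
        ),
        # native static/shared libraries are packaged under lib/<abi>/
        "static_libraries": sorted(
            n for n in names if n.startswith("lib/") and n.endswith(".so")
        ),
        "signature_hashes": signature_hashes,
        "has_manifest": "AndroidManifest.xml" in names,
    }
```

The dynamic half of the claim (installing and driving the APP on a device while capturing traffic) depends on device tooling and is left out of this sketch.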
4. The mobile application classification method based on dynamic and static combination of multidimensional APK features according to claim 1, wherein in step S2, when the features of the communication class are analyzed, feature data covering functions, permissions, layout, and content data are specifically included, and a rule matching data set and a scoring and sorting rule set are constructed;
features of different dimensions are configured according to the service characteristics or technical characteristics of the APP, the specific features of the APP comprising keywords; a communication APP is characterized by the keywords address book, friends, and transmission, and by specific SDKs and static libraries for data transmission and data encryption.
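A per-class feature configuration of the kind described in claim 4 might look like the following sketch. The three keywords come from the claim; the SDK package prefix, the library file name, the profile layout, and the `hit_dimensions` helper are all hypothetical placeholders, not names used by the method.

```python
# Feature profile for the communication class. Only the keywords are taken
# from the claim; the SDK prefix and library name are made-up examples.
COMMUNICATION_PROFILE = {
    "keywords": ("address book", "friends", "transmission"),
    "sdk_markers": ("com.example.securechat",),    # hypothetical SDK package prefix
    "static_libraries": ("libexamplecrypto.so",),  # hypothetical encryption library
}


def hit_dimensions(features: dict, profile: dict) -> list[str]:
    """Return the profile dimensions that the APP's features match."""
    hits = []
    text = " ".join(features.get("texts", [])).lower()
    if any(keyword in text for keyword in profile["keywords"]):
        hits.append("keywords")
    if any(
        sdk.startswith(marker)
        for sdk in features.get("sdks", [])
        for marker in profile["sdk_markers"]
    ):
        hits.append("sdk_markers")
    # Compare library file names only, ignoring the lib/<abi>/ directory part.
    library_names = {p.rsplit("/", 1)[-1] for p in features.get("static_libraries", [])}
    if library_names & set(profile["static_libraries"]):
        hits.append("static_libraries")
    return hits
```

Keeping each class as a data-only profile means a new service class can be added by configuration rather than by changing the matching code.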
5. The mobile application classification method based on dynamic and static combination of multidimensional APK features according to claim 1, wherein in step S3 the static source code features, dynamic traffic features, and dynamic content features of the APP are obtained, and data cleaning and formatted storage are performed on the full data;
the static source code features comprise the name, LOGO, package name, signature HASH, signature OWNER, localization configuration, layout files, permissions, service declarations, SDK, and static library information;
the dynamic traffic features comprise domain name, IP, path, header information, and request content information; and the dynamic content features include, for example, page display content, page snapshots, and page layout file information.
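The data cleaning and formatted storage mentioned in claim 5 could be sketched as below. The field names, the specific cleaning steps (trimming whitespace, de-duplicating lists, dropping empty values, lowercasing the domain), and the choice of one JSON object per record are assumptions made for illustration; the claim does not specify them.

```python
import json


def clean_record(raw: dict) -> dict:
    """Trim strings, de-duplicate list values, and drop empty fields."""
    cleaned = {}
    for key, value in raw.items():
        if isinstance(value, str):
            value = value.strip()
        elif isinstance(value, (list, tuple, set)):
            value = sorted({str(v).strip() for v in value if str(v).strip()})
        if value is None or value == "" or value == []:
            continue  # nothing usable was extracted for this dimension
        cleaned[key] = value
    # Domain names are case-insensitive; normalize so matching is consistent.
    if isinstance(cleaned.get("domain"), str):
        cleaned["domain"] = cleaned["domain"].lower().rstrip(".")
    return cleaned


def to_storage_line(raw: dict) -> str:
    """Serialize a cleaned record as one JSON line with stable key order."""
    return json.dumps(clean_record(raw), sort_keys=True, ensure_ascii=False)
```

Storing each APP as a single object with stable keys lets the later scanning programs load records "in object form" without re-parsing the APK or the captured traffic.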
6. The mobile application classification method based on dynamic and static combination of multidimensional APK features according to claim 1, wherein in step S4 the specific rules are: keyword feature matching is performed on the APK name, the localization configuration, and the domain name, IP, path, header information, and request content of the traffic; data set feature matching is performed on the LOGO, package name, signature HASH, and signature OWNER; and a score is calculated for each dimension of the APK according to the matching degree and weight of the features.
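The per-dimension scoring of claim 6 can be sketched as score = weight × matching degree. The claim does not define the matching degree, so this sketch assumes the fraction of rule keywords found for keyword dimensions and a 0/1 membership test for data set dimensions; the function names, the default weight of 1.0, and the example weights are likewise hypothetical.

```python
def keyword_degree(text: str, keywords: tuple[str, ...]) -> float:
    """Fraction of the rule keywords that occur in the text (assumed definition)."""
    if not keywords:
        return 0.0
    lowered = text.lower()
    return sum(keyword.lower() in lowered for keyword in keywords) / len(keywords)


def dataset_degree(value, known_values: set) -> float:
    """1.0 if the value is in the known feature data set, else 0.0."""
    return 1.0 if value in known_values else 0.0


def dimension_scores(features, keyword_rules, dataset_rules, weights):
    """Compute weight * matching degree for every configured dimension."""
    scores = {}
    for dim, keywords in keyword_rules.items():
        degree = keyword_degree(features.get(dim, ""), keywords)
        scores[dim] = weights.get(dim, 1.0) * degree
    for dim, known_values in dataset_rules.items():
        degree = dataset_degree(features.get(dim), known_values)
        scores[dim] = weights.get(dim, 1.0) * degree
    return scores
```

For example, an APP named "Friends Chat" checked against the keywords `("chat", "friends", "call", "message")` has a matching degree of 0.5 on the name dimension; with a weight of 2.0 that dimension scores 1.0.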
7. The mobile application classification method based on dynamic and static combination of multidimensional APK features according to claim 6, wherein for each APK the scores are summarized into two scores A and B, where A is the summarized score of the APK name and localization configuration content attributes and B is the summarized score of the package name, signature, LOGO, and other dimensions; the numbers of hit dimensions are recorded at the same time, A1 denoting the number of hit dimensions in A and B1 the number of hit dimensions in B;
when judging whether an APK meets the classification requirement, if B1 is greater than 1, that is, the APK matches the feature rules in more than one of the package name, signature, and LOGO dimensions, the APK is judged to be an APK of the expected classification;
in step S5, if B1 is equal to 1 and A1 is greater than 1, a manual review process is performed.
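The decision logic of claim 7 is small enough to sketch directly. The dimension groupings follow the claim, with "signature" treated as a single dimension for simplicity; counting a dimension as hit when its score is positive, the `Summary` type, and the `unresolved` outcome for cases the claim does not cover are assumptions.

```python
from typing import NamedTuple

A_DIMS = ("name", "localization")               # content-attribute dimensions
B_DIMS = ("package_name", "signature", "logo")  # identity dimensions


class Summary(NamedTuple):
    A: float  # summarized score over A_DIMS
    B: float  # summarized score over B_DIMS
    A1: int   # number of hit dimensions in A
    B1: int   # number of hit dimensions in B


def summarize(scores: dict) -> Summary:
    """Sum per-dimension scores into A and B and count hit dimensions."""
    a_scores = [scores.get(dim, 0.0) for dim in A_DIMS]
    b_scores = [scores.get(dim, 0.0) for dim in B_DIMS]
    return Summary(
        A=sum(a_scores),
        B=sum(b_scores),
        A1=sum(score > 0 for score in a_scores),
        B1=sum(score > 0 for score in b_scores),
    )


def decide(summary: Summary) -> str:
    """Apply the B1 / A1 thresholds from the claim."""
    if summary.B1 > 1:
        return "classified"       # several identity dimensions match
    if summary.B1 == 1 and summary.A1 > 1:
        return "manual_review"    # one identity hit plus several content hits
    return "unresolved"           # outcome not specified by the claim
```

With only two dimensions in A as grouped here, `A1 > 1` means both the name and the localization configuration were hit.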
8. The mobile application classification method based on dynamic and static combination of multidimensional APK features according to claim 7, further comprising feature rule extraction and merging: feature rules are extracted from the APKs classified in the previous step in order to update and improve the rules in the future, and the extracted feature rules are merged into the rule set of step S4.
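The rule feedback loop of claim 8 can be sketched as extracting identity values from a confirmed APK and merging them into the data set rules. The dimension names, the representation of a rule set as a mapping from dimension to a set of known values, and both function names are hypothetical.

```python
# Dimensions whose concrete values are worth remembering as data set rules.
DATASET_DIMS = ("package_name", "signature_hash", "signature_owner")


def extract_rules(confirmed_apk: dict) -> dict[str, set]:
    """Turn the identity features of a confirmed APK into candidate rules."""
    return {
        dim: {confirmed_apk[dim]}
        for dim in DATASET_DIMS
        if confirmed_apk.get(dim)
    }


def merge_rules(rule_set: dict[str, set], new_rules: dict[str, set]) -> dict[str, set]:
    """Return a new rule set containing both existing and extracted values."""
    merged = {dim: set(values) for dim, values in rule_set.items()}
    for dim, values in new_rules.items():
        merged.setdefault(dim, set()).update(values)
    return merged
```

Returning a new mapping instead of mutating the active rule set lets a running scan finish against a consistent snapshot before the updated rules are swapped in.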
9. A computer-readable storage medium storing a computer program which, when executed by a main controller, implements the method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311471891.4A CN118277914A (en) | 2023-11-07 | 2023-11-07 | Mobile application classification method based on dynamic and static combination of multidimensional APK features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118277914A (en) | 2024-07-02 |
Family
ID=91647616
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311471891.4A Pending CN118277914A (en) | 2023-11-07 | 2023-11-07 | Mobile application classification method based on dynamic and static combination of multidimensional APK features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118277914A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9152694B1 (en) * | 2013-06-17 | 2015-10-06 | Appthority, Inc. | Automated classification of applications for mobile devices |
CN107133248A (en) * | 2016-02-29 | 2017-09-05 | 阿里巴巴集团控股有限公司 | Application classification method and device |
CN110245273A (en) * | 2019-06-21 | 2019-09-17 | 武汉绿色网络信息服务有限责任公司 | Method for obtaining an APP service feature library and corresponding device |
Non-Patent Citations (2)
Title |
---|
Martina Lindorfer et al.: "MARVIN: Efficient and Comprehensive Mobile App Classification Through Static and Dynamic Analysis", 2015 IEEE 39th Annual International Computers, Software & Applications Conference, 24 September 2015 (2015-09-24), pages 422-433 * |
Wu Yueming et al.: "Obfuscation-resistant Android malware detection based on graph convolutional networks", Journal of Software (软件学报), vol. 34, no. 6, 1 June 2023 (2023-06-01), pages 2526-2542 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||