
CN108491228A - Binary vulnerability code clone detection method and system - Google Patents


Info

Publication number
CN108491228A
Authority
CN
China
Prior art keywords
function
code
feature
binary
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810267094.7A
Other languages
Chinese (zh)
Other versions
CN108491228B (en
Inventor
姜宇
杨鑫
高健
顾明
孙家广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201810267094.7A
Publication of CN108491228A
Application granted
Publication of CN108491228B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/75 Structural analysis for program understanding
    • G06F8/751 Code clone detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a binary vulnerability code clone detection method and system. The method includes: extracting the function features of a first function in the binary code to be detected and the function features of a second function in binary vulnerability code, where the function features include basic block information, control flow information, and function call information; inputting the function features of the first function and of the second function separately into a preset neural network, and using the preset neural network to compute the similarity between the first function and the second function; and, when the similarity reaches a preset threshold, determining that the binary code to be detected contains a clone of the binary vulnerability code. The method and system address the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation. They ensure comprehensive clone type detection and accurate results while effectively improving detection efficiency.

Description

Binary vulnerability code clone detection method and system
Technical Field
The present invention relates to the technical field of computer program analysis, and more particularly to a binary vulnerability code clone detection method and system.
Background Art
Since the birth of the software industry, the number of computer users has grown rapidly, and software has spread quickly into every aspect of work and daily life. Much source code is openly available on the Internet, and searching the Internet for relevant code has become a fast and effective way for developers to work. Copying a piece of code, either verbatim or with simple modifications, and then applying it in a new context is common in current software development. This form of code reuse is known as code cloning. Large software projects contain many cloned code segments. Even systems widely regarded as high quality, such as the X Window System, contain 19% cloned code; the Linux core contains 15% to 25% cloned code; and in open-source software projects, more than 50% of the code is reused. For example, 22.3% of the Linux kernel source code is reported to be reused code, with a small fraction of code segments copied at least 8 times, and 145 pieces of unpatched vulnerable code were confirmed in the Linux kernel (version 2.6.6) of Debian packages. If cloned code is vulnerable, the related security problems may spread widely, because a published vulnerability may also exist in other software that has not been fully patched. Using published vulnerability information, an attacker can exploit unpatched clones to seriously harm software systems. An accurate and effective method for detecting cloned vulnerable code is therefore urgently needed.
Based on the textual similarity and functional similarity of source code, the field currently divides code clones into four types: 1) code segments that are identical except for whitespace and comments; 2) code segments that are syntactically identical except for identifiers, types, whitespace, and comments; 3) copied code segments in which statements have been added, deleted, or modified; 4) code segments that are functionally identical but syntactically different. Some researchers refer to type 1 as exact clones, types 2 and 3 as near-miss clones, and type 4 as semantic clones.
Researchers in China and abroad have proposed many clone detection methods and techniques and have developed corresponding clone detection tools. These methods can broadly be divided into text-based, lexical (token-based), syntax-based, and semantics-based approaches.
1) Text-based detection. This approach compares the source code of a software system directly (filtering out only differences in comments and layout), without converting the source code into an intermediate representation. Johnson first proposed text-based clone detection: code segments of a fixed number of lines are hashed, an incremental hash function is then used to identify code segments with the same hash value, which are the clones, and a sliding window technique is combined with this to find clones of different lengths.
2) Lexical detection. This approach (also called the token-based approach) first uses a lexical analysis tool (such as lex) to convert every line of the source code into a token sequence and concatenates all the sequences into one token string. It then scans this string for similar token subsequences and reports the source code corresponding to those similar subsequences as clones.
3) Syntax-based detection. This approach rests on the observation that similar code segments should also have similar syntactic structure. The program is parsed into a syntax tree, and the source code fragments corresponding to similar subtrees are the clones. Baxter et al. were the first to apply abstract syntax tree (AST) techniques to clone detection: the source code is first parsed into an annotated syntax tree, the subtrees are then hashed into N buckets, and the subtrees within the same bucket are compared for similarity to obtain the clones.
4) Semantics-based detection. This technique is typified by the program dependence graph (PDG) method: for a given program, a set of PDGs is built from the data flow and control dependences between program statements, and the code segments corresponding to isomorphic subgraphs in this set are clones. In recent years, some researchers have used dynamic analysis to detect semantically similar code segments. For example, Jiang et al. of the University of California feed a set of input data to code segments and compare their outputs to find semantically similar clones. Marcus et al. use an information retrieval technique (latent semantic indexing) to statically analyze the source code of a software system and thereby detect semantic clones.
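The text-based approach in item 1) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not Johnson's actual algorithm: it hashes every fixed-size window of normalized lines with SHA-1 rather than an incremental hash, treats `#` as the only comment marker, and uses a hypothetical function name `find_text_clones`.

```python
import hashlib
from collections import defaultdict


def normalize(line: str) -> str:
    """Drop a trailing '#' comment and all whitespace (toy normalization)."""
    return "".join(line.split("#", 1)[0].split())


def find_text_clones(lines, window=3):
    """Group start indices of `window`-line fragments whose hashes are equal."""
    cleaned = [normalize(line) for line in lines]
    buckets = defaultdict(list)
    for start in range(len(cleaned) - window + 1):
        fragment = "\n".join(cleaned[start:start + window])
        if not fragment.strip():
            continue  # skip windows that contain only blank or comment lines
        digest = hashlib.sha1(fragment.encode("utf-8")).hexdigest()
        buckets[digest].append(start)
    # Only fragments that occur more than once are reported as clones.
    return [starts for starts in buckets.values() if len(starts) > 1]
```

Because only comments and layout are filtered out, a sketch like this catches type 1 clones only; a renamed identifier already changes the hash.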
Token-based methods effectively detect type 1 and type 2 clones, have low time and space complexity, do not require the program to be syntactically correct, and are independent of the source code, but they produce many false positives on type 3 clones. Syntax-based methods effectively detect type 1 to type 3 clones, but because the code must be parsed into an AST before similar subtrees are searched, their time and space complexity is higher. Compared with syntax-based comparison, PDG-based techniques analyze source code at a higher level to obtain the semantic information of a program, so they can detect code segments whose statements have been reordered but whose semantics are unchanged. However, building PDGs and finding isomorphic subgraphs are both very costly, which makes the technique hard to apply to large software systems.
In summary, existing code clone detection methods suffer from incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation.
Summary of the Invention
To overcome the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation, the present invention provides a binary vulnerability code clone detection method and system.
In one aspect, the present invention provides a binary vulnerability code clone detection method, including:
extracting the function features of a first function in the binary code to be detected and the function features of a second function in binary vulnerability code, the function features including basic block information, control flow information, and function call information;
inputting the function features of the first function and the function features of the second function separately into a preset neural network, and using the preset neural network to compute the similarity between the first function and the second function;
when the similarity reaches a preset threshold, determining that the binary code to be detected contains a clone of the binary vulnerability code.
Preferably, when the similarities between all first functions in the binary code to be detected and the second function are all less than the preset threshold, it is determined that the binary code to be detected does not contain a clone of the binary vulnerability code.
Preferably, using the preset neural network to compute the similarity between the first function and the second function specifically includes:
using the preset neural network to obtain, from the basic block information of the first function and the basic block information of the second function respectively, a first sub-feature vector for each basic block in the first function and a second sub-feature vector for each basic block in the second function;
processing all the first sub-feature vectors according to the control flow information and function call information of the first function to obtain a first feature vector, and processing all the second sub-feature vectors according to the control flow information and function call information of the second function to obtain a second feature vector;
computing the cosine of the angle between the first feature vector and the second feature vector, and taking this cosine value as the similarity between the first function and the second function.
Preferably, the basic block information of the first function and of the second function includes, respectively, the start address of each basic block in the first function and in the second function and, for each basic block, the number of numeric constants, the number of character constants, the number of transfer instructions, the number of function calls, the number of instructions, the number of arithmetic instructions, the number of logic instructions, the betweenness centrality, and the number of child nodes.
Preferably, the control flow information of the first function and of the second function includes, respectively, the dependences between every two basic blocks in the first function and in the second function.
Preferably, the function call information of the first function includes the start addresses of the functions called by the first function and the start addresses of the functions that call the first function; the function call information of the second function includes the start addresses of the functions called by the second function and the start addresses of the functions that call the second function.
Preferably, the method further includes:
obtaining a first sample function and a second sample function carrying a label that marks the pair as clone or non-clone, and extracting the function features of the first sample function and the function features of the second sample function;
building the preset neural network and setting a target error for the preset neural network;
inputting the function features of the first sample function and the function features of the second sample function separately into the preset neural network to train the preset neural network;
ending the training of the preset neural network when the difference between its actual output and the desired output is no greater than the target error.
Preferably, obtaining the first sample function and the second sample function carrying a label that marks the pair as clone or non-clone specifically includes:
obtaining multiple sample functions and cross-compiling each sample function under different configurations to obtain multiple functions with the same name and multiple functions with different names;
forming, from every two functions with the same name, a first sample function and a second sample function labeled as a clone pair, and forming, from every two functions with different names, a first sample function and a second sample function labeled as a non-clone pair.
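The pairing rule for training samples can be sketched as follows. This is a minimal Python illustration: the tuple layout `(function_name, build_config, features)`, the helper name `build_training_pairs`, and the label values +1 and -1 are assumptions made for the example; the source only states that same-name pairs are labeled clone and different-name pairs non-clone.

```python
from itertools import combinations


def build_training_pairs(compiled):
    """Pair up compiled functions and label each pair.

    compiled: list of (function_name, build_config, features) tuples, one per
    function per cross-compilation configuration (layout is hypothetical).
    Returns (features_a, features_b, label) with label +1 for a clone pair
    (same name) and -1 for a non-clone pair (different names).
    """
    pairs = []
    for a, b in combinations(compiled, 2):
        label = 1 if a[0] == b[0] else -1
        pairs.append((a[2], b[2], label))
    return pairs
```

Pairing every two functions grows quadratically with the number of compiled functions, so a practical pipeline would probably sample the non-clone pairs rather than enumerate them all.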
In another aspect, the present invention provides a binary vulnerability code clone detection system, including:
a feature extraction module, configured to extract the function features of a first function in the binary code to be detected and the function features of a second function in binary vulnerability code, the function features including basic block information, control flow information, and function call information;
a similarity calculation module, configured to input the function features of the first function and the function features of the second function separately into a preset neural network, and to use the preset neural network to compute the similarity between the first function and the second function;
a clone detection module, configured to determine, when the similarity reaches a preset threshold, that the binary code to be detected contains a clone of the binary vulnerability code.
In a further aspect, the present invention provides a device for the binary vulnerability code clone detection method, including:
at least one processor; and
at least one memory communicatively connected to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform any of the methods described above.
The binary vulnerability code clone detection method and system provided by the present invention extract the function features of a first function in the binary code to be detected and the function features of a second function in binary vulnerability code, input the two sets of function features separately into a preset neural network, and use the preset neural network to compute the similarity between the first function and the second function. When the similarity reaches a preset threshold, the binary code to be detected is determined to contain a clone of the binary vulnerability code. The method and system accurately detect clones of vulnerability code at the binary level without requiring source code, and are therefore generally applicable. The code is fed to the neural network at basic block granularity for deep learning, which addresses the problems of existing code clone detection methods (incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation), ensures comprehensive clone type detection and accurate results, and effectively improves detection efficiency.
Brief Description of the Drawings
Fig. 1 is an overall flow diagram of a binary vulnerability code clone detection method according to an embodiment of the present invention;
Fig. 2 is an overall flow diagram of the function similarity calculation method according to an embodiment of the present invention;
Fig. 3 is a structural diagram of the computation of function feature vectors in the preset neural network according to an embodiment of the present invention;
Fig. 4 is an overall flow diagram of the training process of the preset neural network according to an embodiment of the present invention;
Fig. 5 is an overall structural diagram of a binary vulnerability code clone detection system according to an embodiment of the present invention;
Fig. 6 is a structural block diagram of a device for the binary vulnerability code clone detection method according to an embodiment of the present invention.
Detailed Description
Specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following embodiments illustrate the present invention and do not limit its scope.
It should be noted that most existing research on code clone analysis works on source code, and comparatively little works on binary code. In some cases, however, source code is not available; most commercial software, for example, is released without source code, and similarity detection on binary files then becomes particularly important. In view of this, the present invention provides a binary vulnerability code clone detection method, including: extracting the function features of a first function in the binary code to be detected and the function features of a second function in binary vulnerability code, where the function features include basic block information, control flow information, and function call information; inputting the function features of the first function and of the second function separately into a preset neural network, and using the preset neural network to compute the similarity between the first function and the second function; and, when the similarity reaches a preset threshold, determining that the binary code to be detected contains a clone of the binary vulnerability code.
Fig. 1 is an overall flow diagram of a binary vulnerability code clone detection method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
S101: Extract the function features of a first function in the binary code to be detected and the function features of a second function in binary vulnerability code, where the function features include basic block information, control flow information, and function call information.
It should be noted that the detection method in this embodiment operates on binary code; when the code to be detected is source code, it must first be converted into binary code. In addition, existing vulnerability code is generally a single function, that is, a piece of vulnerability code usually contains only one function. Therefore, when the code to be detected contains multiple functions, the functions in the code should first be separated, and each function is then examined in turn. Furthermore, because existing public vulnerability code libraries contain many pieces of vulnerability code, when the code to be detected must be checked against all the vulnerability code in such a library, each piece of vulnerability code should be checked separately. The detection process is illustrated below for a single function in the code to be detected:
First, the binary code to be detected and the binary vulnerability code are obtained with a compiler. A binary disassembly tool extracts a function from the binary code to be detected, which serves as the first function; the same tool extracts the function from the binary vulnerability code, which serves as the second function. The disassembly tool is also used to extract the function features of the first function and of the second function. The function features include basic block information, control flow information, and function call information: the basic block information describes the basic blocks contained in a function, the control flow information describes the dependences between the basic blocks of a function, and the function call information describes which other functions the function calls and which functions call it. Together, basic block information, control flow information, and function call information describe a function comprehensively.
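One way to hold the three kinds of function features in memory is sketched below in Python. The class and field names are hypothetical and are not taken from the patent; the per-block statistics follow the list given in the summary (constant counts, instruction counts, betweenness centrality, number of child nodes), and the edge direction in `control_flow` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BasicBlock:
    """Per-block statistics; field names are illustrative only."""
    start_addr: int                 # identifies the basic block
    num_numeric_consts: int = 0
    num_char_consts: int = 0
    num_transfer_insns: int = 0
    num_calls: int = 0
    num_insns: int = 0
    num_arith_insns: int = 0
    num_logic_insns: int = 0
    betweenness: float = 0.0        # structural feature
    num_children: int = 0           # structural feature

    def as_vector(self) -> List[float]:
        """Flatten the statistics (without the address) into a numeric vector."""
        return [float(x) for x in (
            self.num_numeric_consts, self.num_char_consts,
            self.num_transfer_insns, self.num_calls, self.num_insns,
            self.num_arith_insns, self.num_logic_insns,
            self.betweenness, self.num_children)]


@dataclass
class FunctionFeature:
    """Basic block, control flow, and call information for one function."""
    blocks: Dict[int, BasicBlock] = field(default_factory=dict)
    # (a, b) means block a depends on block b; the direction is an assumption.
    control_flow: List[Tuple[int, int]] = field(default_factory=list)
    callees: List[int] = field(default_factory=list)  # start addresses this function calls
    callers: List[int] = field(default_factory=list)  # start addresses of functions calling it
```

Keeping the start address out of `as_vector` reflects its role as an identifier rather than a statistic; whether the network also consumes the address is not stated in the source.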
S102: Input the function features of the first function and the function features of the second function separately into the preset neural network, and use the preset neural network to compute the similarity between the first function and the second function.
Specifically, once the function features of the first function and of the second function have been obtained, they are input separately into the preset neural network, which has been trained in advance. The preset neural network computes the similarity between the first function and the second function from their function features and outputs the result through its output layer.
S103: When the similarity reaches the preset threshold, determine that the binary code to be detected contains a clone of the binary vulnerability code.
Specifically, once the similarity between the first function and the second function has been computed, it is compared with the preset threshold, which is a critical similarity value set in advance. When the computed similarity reaches the preset threshold, it can be determined that the binary code to be detected contains a clone of the binary vulnerability code.
The binary vulnerability code clone detection method provided by the present invention extracts the function features of a first function in the binary code to be detected and the function features of a second function in binary vulnerability code, inputs the two sets of function features separately into a preset neural network, and uses the preset neural network to compute the similarity between the first function and the second function. When the similarity reaches the preset threshold, the binary code to be detected is determined to contain a clone of the binary vulnerability code. The method accurately detects clones of vulnerability code at the binary level without requiring source code, and is therefore generally applicable. The code is fed to the neural network at basic block granularity for deep learning, which addresses the problems of existing code clone detection methods (incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation), ensures comprehensive clone type detection and accurate results, and effectively improves detection efficiency.
Based on any of the above embodiments, a binary vulnerability code clone detection method is provided in which, when the similarities between all first functions in the binary code to be detected and the second function are all less than the preset threshold, it is determined that the binary code to be detected does not contain a clone of the binary vulnerability code.
Specifically, when the binary code to be detected contains multiple first functions, the method of the above embodiment should be used to compute the similarity between each first function in the binary code to be detected and the second function in the binary vulnerability code. When the similarities between all first functions in the binary code to be detected and the second function in the binary vulnerability code are all less than the preset threshold, it can be determined that the binary code to be detected does not contain a clone of the binary vulnerability code.
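The per-function threshold test and the "no clone" conclusion can be sketched together in Python. The function name `scan_for_clones`, the `similarity` callback, and the default threshold of 0.9 are placeholders chosen for the example; the source does not give a threshold value.

```python
def scan_for_clones(first_functions, second_function, similarity, threshold=0.9):
    """Compare every first function against one vulnerable (second) function.

    first_functions: mapping from function name to its function features.
    similarity: callable returning a score for two sets of function features
                (stands in for the trained preset neural network).
    Returns the (name, score) pairs whose score reaches the threshold.
    An empty list means no clone of the vulnerable function was found.
    """
    hits = []
    for name, features in first_functions.items():
        score = similarity(features, second_function)
        if score >= threshold:
            hits.append((name, score))
    return hits
```

To check a binary against a whole vulnerability library, this scan would simply be repeated once per vulnerable function.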
A kind of binary vulnerability Code Clones detection method provided by the invention owns according in binary code to be detected The similarity of first function and second function is respectively less than predetermined threshold value, determines that there is no binary systems to leak in binary code to be detected The cloned codes of hole code.This method can accurately realize clone's detection of bug code on the basis of binary, without obtaining Source code is obtained, there is general applicability;Neural network is inputted as fine granularity using basic block for code simultaneously and carries out deep learning, And then clone's detection of code is realized, the clone type detection solved present in existing Code Clones detection method is not complete Face, accuracy is low, complexity is high and the problems such as being not easy to realize, it is ensured that the comprehensive and testing result of clone type detection Accuracy, while effectively improving detection efficiency.
Based on any of the above-described embodiment, a kind of binary vulnerability Code Clones detection method is provided, Fig. 2 is referred to, it is above-mentioned Using the similarity of default neural computing first function and second function in step S102, specifically include:
S1021, using default neural network according to the basic block message of first function and the basic block message of second function Each basic block is corresponding in the corresponding first subcharacter vector sum second function of each basic block in acquisition first function respectively Second subcharacter vector;
Specifically, the input for the Function feature of first function and the Function feature of second function being passed through into default neural network Neural network is preset in layer input, and wherein Function feature includes basic block message, control stream information and function call information.In this base On plinth, the basic block message that neural network receives first function is preset, the basic block information of first function includes in first function The initial address of each basic block, for being identified to each basic block, and the unstructured feature of each basic block and Structured features.Default neural network is corresponding according to each basic block in the basic block information acquisition first function of first function First subcharacter vector.Similarly, the basic block message that neural network receives second function, the basic block message of second function are preset The initial address for including each basic block in second function, for being identified to each basic block, and each basic block Unstructured feature and structure feature.Default neural network is according to every in the basic block information acquisition second function of second function The corresponding second subcharacter vector of a basic block.
S1022 carries out all first subcharacter vectors according to the control stream information of first function and function call information Processing obtains first eigenvector;According to the control stream information of second function and function call information to all second subcharacters Vector is handled, and second feature vector is obtained;
It specifically, will be in the corresponding first subcharacter vector sum second function of each basic block in above-mentioned acquisition first function The hidden layer of neural network and full articulamentum are preset in the corresponding second subcharacter vector input of each basic block, hidden layer and are connected entirely Layer is connect respectively to carry out all first subcharacter vectors according to the control stream information and function call information of the first function of input Processing obtains first eigenvector;Hidden layer and full articulamentum are according to the control stream information and function tune of the second function of input All second subcharacter vectors are handled respectively with information, obtain second feature vector.
The neuron of hidden layer is believed when handling all first subcharacter vectors according to the control stream of first function Breath determines the dependence between all first subcharacter vectors, if there is no other subcharacter vectors to depend on some neuron Corresponding subcharacter vector, then need to only correspond to a upper hidden layer when neuron handles its corresponding subcharacter vector The value of neuron is multiplied with preset first coefficient matrix;If there are other subcharacter vectors to depend on some neuron pair A upper hidden layer is being corresponded to nerve by the subcharacter vector answered when then the neuron handles its corresponding subcharacter vector The value of member is also needed on the basis of being multiplied with preset first coefficient matrix plus other subcharacters dependent on subcharacter vector Vector corresponds to the product of the value and preset second coefficient matrix of neuron in a upper hidden layer.
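The hidden-layer update described above can be written out as a short pure-Python sketch. It follows the rule as stated (previous value times the first coefficient matrix, plus the second coefficient matrix applied to the previous value of each dependent block) and applies no activation function, since the text mentions none; the names `hidden_layer_step`, `w1`, `w2`, and the `dependents` mapping are hypothetical.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def hidden_layer_step(prev, dependents, w1, w2):
    """Compute one hidden layer from the previous one.

    prev:       {block_id: vector} values of the previous hidden layer.
    dependents: {block_id: [ids of blocks whose sub-feature vectors depend on it]},
                derived from the control flow information.
    w1, w2:     first and second coefficient matrices (lists of rows).
    """
    nxt = {}
    for block, vec in prev.items():
        out = matvec(w1, vec)                    # own previous value times W1
        for dep in dependents.get(block, []):    # no dependents: nothing is added
            contrib = matvec(w2, prev[dep])      # dependent's previous value times W2
            out = [o + c for o, c in zip(out, contrib)]
        nxt[block] = out
    return nxt
```

Applying this step once per hidden layer lets information from dependent blocks spread along the control flow graph, one hop per layer.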
Based on this principle, after the hidden layers have processed all first sub-feature vectors, the processed vectors are integrated and fed into the fully connected layer. The fully connected layer determines the corresponding function-call sub-feature vector from the function call information of the first function, and finally concatenates the integrated first sub-feature vectors with that function-call sub-feature vector to obtain the first feature vector of the first function. Likewise, after the hidden layers have processed all second sub-feature vectors, the processed vectors are integrated and fed into the fully connected layer, which determines the corresponding function-call sub-feature vector from the function call information of the second function and concatenates it with the integrated second sub-feature vectors to obtain the second feature vector of the second function.
S1023: calculate the cosine of the angle between the first feature vector and the second feature vector, and take this cosine value as the similarity between the first function and the second function.
Specifically, once the first feature vector of the first function and the second feature vector of the second function have been obtained, the fully connected layer of the preset neural network calculates the cosine of the angle between the two vectors, and the result is output through the output layer of the preset neural network. The output cosine value is taken as the similarity between the first function and the second function. The similarity lies in the range [-1, 1]: when it is close to 1, the first function and the second function can be determined to be clones of each other; when it is close to -1, they can be determined not to be clones of each other.
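The similarity computation of step S1023 can be illustrated with a short sketch. The function name `cosine_similarity` is chosen here for illustration; in the described method the computation takes place inside the preset neural network rather than in a standalone helper.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


same = cosine_similarity([1.0, 0.0], [1.0, 0.0])       # identical direction
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])  # opposite direction
```

Vectors pointing in the same direction give a value at the top of the range and vectors pointing in opposite directions give a value at the bottom, which matches the clone and non-clone interpretation above.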
The binary vulnerability code clone detection method provided by the present invention uses the preset neural network to obtain, from the basic block information of the first function and of the second function, the first sub-feature vector corresponding to each basic block in the first function and the second sub-feature vector corresponding to each basic block in the second function; processes all first sub-feature vectors according to the control flow information and function call information of the first function to obtain the first feature vector; processes all second sub-feature vectors according to the control flow information and function call information of the second function to obtain the second feature vector; and calculates the cosine of the angle between the first and second feature vectors, taking that value as the similarity between the first function and the second function. The method feeds code into the neural network at basic-block granularity for deep learning and thereby performs code clone detection. It addresses the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation; it ensures comprehensive clone-type detection and accurate detection results while effectively improving detection efficiency.
To make the method of the above embodiments easier to understand, the following example is given:
Fig. 3 is a schematic diagram of how a function feature vector is computed in the preset neural network of an embodiment of the present invention; the figure takes the computation of a single function's feature vector as an example. The function in the figure has 3 basic blocks ($X_1$, $X_2$, $X_3$), and the figure contains 7 hidden layers in total. First, the three neurons $\mu_1^{(0)}$, $\mu_2^{(0)}$, $\mu_3^{(0)}$ of the first hidden layer are initialized, each to a 64-dimensional all-zero vector. Neuron subscripts correspond one-to-one to basic block subscripts, i.e. the first-layer hidden neuron corresponding to basic block $X_1$ is $\mu_1^{(0)}$. The value of each neuron in the next hidden layer is then computed according to the following formula.
In the above formula, one index ranges over the subscripts of the other basic blocks (nodes) that depend on basic block $X_i$; $\mu_i^{(t)}$ denotes the layer-$t$ hidden unit corresponding to basic block $X_i$; $P(i)$ denotes the nodes that call the function; $S(i)$ denotes the nodes called by the function; $W_1, W_2, W_3, P_1, P_2$ are parameter matrices of the preset neural network; $x_i$ denotes the sub-feature vector of basic block $X_i$; and $\sigma(\cdot)$ is the activation function.
Applying the above to Fig. 3 gives the neurons of the second hidden layer, and the computation proceeds layer by layer in the same way. The finally computed $\mu$ is the feature vector of the function; in this embodiment, $\mu$ is a 64-dimensional feature vector.
Similarly, the feature vector of another function is computed in the same way. By calculating the cosine of the angle between the feature vectors of the two functions, the similarity of the two functions can be determined, and from it whether the two functions are clones of each other.
Based on any of the above embodiments, a binary vulnerability code clone detection method is provided in which the basic block information of the first function and of the second function respectively includes, for each basic block in the function, its start address and the corresponding number of numeric constants, number of character constants, number of transfer instructions, number of function calls, number of instructions, number of arithmetic instructions, number of logic instructions, betweenness centrality, and number of child nodes.
Specifically, the basic block information of the first function includes the start address of each basic block in the first function. The start address of a basic block serves as its identifier and uniquely determines that basic block. The basic block information of the first function also includes, for each basic block, 7 non-structural features (the number of numeric constants, the number of character constants, the number of transfer instructions, the number of function calls, the number of instructions, the number of arithmetic instructions, and the number of logic instructions) and 2 structural features (the betweenness centrality and the number of child nodes). The 7 non-structural features and 2 structural features together form the 9-dimensional vector corresponding to each basic block.
Similarly, the basic block information of the second function includes the start address of each basic block in the second function, which serves as the identifier of the basic block and uniquely determines it. The basic block information of the second function also includes, for each basic block, the same 7 non-structural features (the numbers of numeric constants, character constants, transfer instructions, function calls, instructions, arithmetic instructions, and logic instructions) and the same 2 structural features (the betweenness centrality and the number of child nodes), which together form the 9-dimensional vector corresponding to each basic block.
It follows that, from the basic block information of the first function and of the second function, the preset neural network can obtain the first sub-feature vector corresponding to each basic block in the first function and the second sub-feature vector corresponding to each basic block in the second function; both the first and the second sub-feature vectors are the 9-dimensional vectors described above.
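As an illustration of the 9-dimensional sub-feature vector, the sketch below stores the basic block information in a small record and flattens the 7 non-structural and 2 structural features into a list. The class name `BasicBlockInfo`, the field names, the feature order, and the sample values are assumptions made for this example; the text does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BasicBlockInfo:
    start_address: int              # identifier only, not part of the feature vector
    numeric_constants: int          # 7 non-structural features
    character_constants: int
    transfer_instructions: int
    function_calls: int
    instructions: int
    arithmetic_instructions: int
    logic_instructions: int
    betweenness_centrality: float   # 2 structural features
    child_nodes: int

    def to_vector(self) -> List[float]:
        """Return the 9-dimensional sub-feature vector of this basic block."""
        return [
            float(self.numeric_constants),
            float(self.character_constants),
            float(self.transfer_instructions),
            float(self.function_calls),
            float(self.instructions),
            float(self.arithmetic_instructions),
            float(self.logic_instructions),
            float(self.betweenness_centrality),
            float(self.child_nodes),
        ]


# Made-up values for a single basic block.
block = BasicBlockInfo(0x401000, 2, 1, 1, 0, 12, 4, 3, 0.25, 2)
vector = block.to_vector()
```

The start address is kept alongside the features so that control flow and call relations can refer to the block, but it does not enter the vector itself.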
The binary vulnerability code clone detection method provided by the present invention obtains the basic block information of the first function and of the second function and feeds it into the neural network at basic-block granularity for deep learning, thereby performing code clone detection. It addresses the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation; it ensures comprehensive clone-type detection and accurate detection results while effectively improving detection efficiency.
Based on any of the above embodiments, a binary vulnerability code clone detection method is provided in which the control flow information of the first function and of the second function respectively includes the dependencies between every two basic blocks in the first function and in the second function.
Specifically, the control flow information of the first function includes the dependencies between every two basic blocks in the first function, and the control flow information of the second function includes the dependencies between every two basic blocks in the second function. These dependencies determine the direction in which parameters are computed and adjusted within the preset neural network. The preset neural network performs its computation separately on the control flow information of the first function and of the second function, and finally outputs the first feature vector corresponding to the first function and the second feature vector corresponding to the second function. The dependencies can be represented as a dependency topology graph or as an array; the representation can be chosen according to actual needs and is not specifically limited here.
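The two representations mentioned above, a dependency graph and an array, can be sketched as follows. The helper names and the edge convention (each pair is `(dependent, dependency)`, with blocks numbered from 0) are assumptions for this example only.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (dependent block, block it depends on)


def to_dependency_graph(block_count: int, edges: List[Edge]) -> Dict[int, List[int]]:
    """Map each block to the list of blocks that depend on it."""
    graph: Dict[int, List[int]] = {i: [] for i in range(block_count)}
    for dependent, dependency in edges:
        graph[dependency].append(dependent)
    return graph


def to_dependency_array(block_count: int, edges: List[Edge]) -> List[List[int]]:
    """Build a 0/1 matrix where row r, column c is 1 if block r depends on block c."""
    matrix = [[0] * block_count for _ in range(block_count)]
    for dependent, dependency in edges:
        matrix[dependent][dependency] = 1
    return matrix


# Blocks 1 and 2 both depend on block 0.
edges = [(1, 0), (2, 0)]
graph = to_dependency_graph(3, edges)
array = to_dependency_array(3, edges)
```

Both structures carry the same information; the graph form is convenient when iterating over the dependents of one block, while the array form suits matrix-style computation.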
The binary vulnerability code clone detection method provided by the present invention obtains the control flow information of the first function and of the second function, which allows the preset neural network to perform its computation separately on the control flow information of each function and finally output the first feature vector corresponding to the first function and the second feature vector corresponding to the second function, thereby performing code clone detection. It addresses the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation; it ensures comprehensive clone-type detection and accurate detection results while effectively improving detection efficiency.
Based on any of the above embodiments, a binary vulnerability code clone detection method is provided in which the function call information of the first function includes the start addresses of the functions called by the first function and the start addresses of the functions that call the first function, and the function call information of the second function includes the start addresses of the functions called by the second function and the start addresses of the functions that call the second function.
Specifically, the function call information of the first function includes the start addresses of the functions that the first function calls and the start addresses of the functions that call the first function, which together determine the caller and callee relationships between the first function and other functions. The function call information of the second function likewise includes the start addresses of the functions that the second function calls and the start addresses of the functions that call the second function, which determine the caller and callee relationships between the second function and other functions. The start address of a function is its unique identifier and uniquely determines that function. The preset neural network performs its computation separately on the function call information of the first function and of the second function, and finally outputs the first feature vector corresponding to the first function and the second feature vector corresponding to the second function.
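The function call information described above amounts to two sets of start addresses per function. The following sketch derives them from a list of call edges; the names `FunctionCallInfo` and `call_info_for`, the `(caller, callee)` edge format, and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class FunctionCallInfo:
    callees: FrozenSet[int]  # start addresses of functions this function calls
    callers: FrozenSet[int]  # start addresses of functions that call this function


def call_info_for(function: int,
                  call_edges: Iterable[Tuple[int, int]]) -> FunctionCallInfo:
    """Collect the call information of one function from (caller, callee) edges."""
    edges = list(call_edges)
    return FunctionCallInfo(
        callees=frozenset(callee for caller, callee in edges if caller == function),
        callers=frozenset(caller for caller, callee in edges if callee == function),
    )


# Made-up call edges between function start addresses.
edges = [(0x401000, 0x401200), (0x401200, 0x401400), (0x401300, 0x401200)]
info = call_info_for(0x401200, edges)
```

Because a start address uniquely identifies a function, the two sets are enough to recover both directions of the call relationship for the function in question.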
The binary vulnerability code clone detection method provided by the present invention obtains the function call information of the first function and of the second function, which allows the preset neural network to perform its computation separately on the function call information of each function and finally output the first feature vector corresponding to the first function and the second feature vector corresponding to the second function, thereby performing code clone detection. It addresses the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation; it ensures comprehensive clone-type detection and accurate detection results while effectively improving detection efficiency.
Based on any of the above embodiments, a binary vulnerability code clone detection method is provided. Referring to Fig. 4, the method further includes a step of training the preset neural network, which proceeds as follows:
S401: obtain a first sample function and a second sample function carrying a label that characterizes them as clone or non-clone, and extract the function features of the first sample function and the function features of the second sample function;
Specifically, obtaining the labeled first and second sample functions includes obtaining first and second sample functions labeled as clones and first and second sample functions labeled as non-clones. A first sample function and a second sample function labeled as clones are clones of each other; a first sample function and a second sample function labeled as non-clones are not clones of each other. In this embodiment, multiple groups of labeled first and second sample functions can be obtained to form multiple training samples for training the preset neural network; the number can be set according to actual needs and is not specifically limited here.
The function features of the first sample function and of the second sample function are extracted using a binary code disassembly tool. The function features include basic block information, control flow information, and function call information: the basic block information describes the basic blocks contained in a function, the control flow information describes the dependencies between the basic blocks in a function, and the function call information describes which other functions the function calls and which other functions call it. Together, the basic block information, control flow information, and function call information describe a function comprehensively.
S402: build the preset neural network and set its target error;
Specifically, a neural network model is built. In this embodiment, the model comprises an input layer, hidden layers, a fully connected layer, and an output layer. In other embodiments, the model of the preset neural network can be built according to actual needs and is not specifically limited here. Once the preset neural network has been built, its target error is set; the target error is the desired value of the error between the actual output and the expected output of the preset neural network. The target error value can be set according to actual needs and is not specifically limited here.
S403: input the function features of the first sample function and the function features of the second sample function into the preset neural network respectively, and train the preset neural network;
Specifically, after the function features of the first sample function and of the second sample function have been obtained, they are input respectively into the preset neural network built above, and the preset neural network is trained.
S404: when the difference between the actual output of the preset neural network and the expected output is not greater than the target error, training of the preset neural network ends.
Specifically, while the preset neural network is being trained with the first sample function and the second sample function, training ends as soon as the difference between the actual output of the preset neural network and the expected output is not greater than the target error.
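The stopping rule of step S404 can be illustrated with a toy loop. Everything in this sketch is an assumption made for illustration: `train_until_target` and `ToyModel` are hypothetical names, the one-parameter model stands in for the preset neural network, and the `max_epochs` safety limit is not part of the described method.

```python
from typing import Callable, Tuple


def train_until_target(step: Callable[[], float],
                       target_error: float,
                       max_epochs: int) -> Tuple[int, float]:
    """Run training passes until |actual - expected| is not greater than target_error.

    step() performs one training pass and returns the current absolute error.
    """
    if max_epochs < 1:
        raise ValueError("max_epochs must be at least 1")
    error = float("inf")
    for epoch in range(1, max_epochs + 1):
        error = step()
        if error <= target_error:   # S404: difference not greater than target error
            return epoch, error
    return max_epochs, error        # safety limit reached before the target error


class ToyModel:
    """One-parameter stand-in whose output should reach the expected value 1.0."""

    def __init__(self) -> None:
        self.weight = 0.0

    def step(self) -> float:
        expected = 1.0
        gradient = 2.0 * (self.weight - expected)   # gradient of squared error
        self.weight -= 0.25 * gradient              # one gradient-descent update
        return abs(self.weight - expected)


model = ToyModel()
epochs, final_error = train_until_target(model.step, target_error=0.05, max_epochs=100)
```

In this toy setting the error halves on every pass, so the loop ends on the first pass whose error is at or below the chosen target error.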
The binary vulnerability code clone detection method provided by the present invention obtains a first sample function and a second sample function labeled as clone or non-clone and extracts the function features of both; builds the preset neural network and sets its target error; inputs the function features of the first sample function and of the second sample function into the preset neural network respectively and trains it; and ends training when the difference between the actual output of the preset neural network and the expected output is not greater than the target error. Training the preset neural network in this way makes it possible to perform code clone detection with the trained network. The method addresses the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation; it ensures comprehensive clone-type detection and accurate detection results while effectively improving detection efficiency.
Based on any of the above embodiments, a binary vulnerability code clone detection method is provided in which obtaining the first sample function and the second sample function labeled as clone or non-clone specifically includes:
obtaining multiple sample functions and cross-compiling each sample function under different configurations to obtain multiple functions with the same name and multiple functions with different names;
forming, from every two functions with the same name, a first sample function and a second sample function labeled as clones, and forming, from every two functions with different names, a first sample function and a second sample function labeled as non-clones.
It should be noted that the binary code obtained by compiling the same function with different compilers may differ in form. To ensure the accuracy of the preset neural network, suitable training samples must therefore be selected. This embodiment obtains the labeled first and second sample functions as follows:
Specifically, to obtain the first and second sample functions labeled as clone or non-clone, multiple sample functions are obtained first; the number of sample functions can be set according to actual needs and is not specifically limited here. Each sample function is then cross-compiled under different configurations to obtain multiple functions with the same name and multiple functions with different names. Although the binary code obtained by compiling the same function with different compilers may differ in form, the corresponding function name is the same. That is, when two functions have the same name, they can be determined to be clone functions; when two functions have different names, they can be determined to be non-clone functions.
Further, from the functions with the same name and the functions with different names obtained above, every two functions with the same name form a first sample function and a second sample function labeled as clones, and every two functions with different names form a first sample function and a second sample function labeled as non-clones.
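The pairing rule above can be sketched in a few lines. The names `CompiledFunction` and `label_pairs`, the build configuration strings, and the numeric labels `1` and `-1` are assumptions for this example; the text only states that same-name pairs are labeled clone and different-name pairs non-clone.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class CompiledFunction(NamedTuple):
    name: str          # source-level function name
    build_config: str  # hypothetical label for the cross-compilation configuration


CLONE, NON_CLONE = 1, -1

LabeledPair = Tuple[CompiledFunction, CompiledFunction, int]


def label_pairs(functions: List[CompiledFunction]) -> List[LabeledPair]:
    """Pair every two compiled functions and label the pair by name equality."""
    pairs: List[LabeledPair] = []
    for first, second in combinations(functions, 2):
        label = CLONE if first.name == second.name else NON_CLONE
        pairs.append((first, second, label))
    return pairs


compiled = [
    CompiledFunction("parse_header", "compiler-A-O0"),
    CompiledFunction("parse_header", "compiler-B-O2"),
    CompiledFunction("copy_buffer", "compiler-A-O0"),
]
pairs = label_pairs(compiled)
clone_pairs = [p for p in pairs if p[2] == CLONE]
```

Here the two builds of `parse_header` form one clone pair, and each of them forms a non-clone pair with `copy_buffer`.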
After the preset neural network has been trained with the labeled first and second sample functions obtained above, it can recognize two pieces of binary code as clone code even when they are differently formed binaries produced by compiling the same function with different compilers.
The binary vulnerability code clone detection method provided by the present invention trains the preset neural network with suitably chosen first and second sample functions labeled as clone or non-clone, so that the network can recognize the differently formed binary code obtained by compiling the same function with different compilers. This helps ensure the accuracy of the preset neural network and makes it possible to perform code clone detection with the trained network. The method addresses the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation; it ensures comprehensive clone-type detection and accurate detection results while effectively improving detection efficiency.
Fig. 5 is an overall structural diagram of a binary vulnerability code clone detection system according to an embodiment of the present invention. As shown in Fig. 5, the present invention provides a binary vulnerability code clone detection system comprising a feature extraction module 1, a similarity calculation module 2, and a clone detection module 3. The modules cooperate to carry out the binary vulnerability code clone detection method of any of the above embodiments, as follows:
The feature extraction module 1 is used to extract the function features of the first function in the binary code to be detected and the function features of the second function in the binary vulnerability code, the function features including basic block information, control flow information, and function call information;
Specifically, the binary code to be detected and the binary vulnerability code are first obtained by means of a compiler. The feature extraction module 1 uses a binary code disassembly tool to extract a function from the binary code to be detected and takes it as the first function; it likewise uses the disassembly tool to extract a function from the binary vulnerability code and takes it as the second function. The feature extraction module 1 also uses the disassembly tool to extract the function features of the first function and of the second function. The function features include basic block information, control flow information, and function call information: the basic block information describes the basic blocks contained in a function, the control flow information describes the dependencies between the basic blocks in a function, and the function call information describes which other functions the function calls and which other functions call it. Together, the basic block information, control flow information, and function call information describe a function comprehensively.
The similarity calculation module 2 is used to input the function features of the first function and the function features of the second function into the preset neural network respectively, and to calculate the similarity between the first function and the second function using the preset neural network;
Specifically, after the function features of the first function and of the second function have been obtained, the similarity calculation module 2 inputs them respectively into the preset neural network, which has been trained in advance. The preset neural network calculates the similarity between the first function and the second function from their function features and outputs the result through its output layer.
The clone detection module 3 is used to determine, when the similarity reaches a preset threshold, that clone code of the binary vulnerability code exists in the binary code to be detected.
Specifically, after the similarity between the first function and the second function has been calculated, the clone detection module 3 compares it with the preset threshold, which is a predefined critical value of the similarity. When the calculated similarity reaches the preset threshold, it can be determined that clone code of the binary vulnerability code exists in the binary code to be detected.
It should be noted that when the binary code to be detected contains multiple first functions, the method of the above embodiments should be used to calculate the similarity between each first function in the binary code to be detected and the second function in the binary vulnerability code. When the similarities between all first functions in the binary code to be detected and the second function in the binary vulnerability code are all less than the preset threshold, it can be determined that no clone code of the binary vulnerability code exists in the binary code to be detected.
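The decision logic of the clone detection module can be sketched as below. The function `find_clones`, the threshold value `0.8`, and the lookup-table stand-in for the trained network are assumptions for this example; in the described system the similarity comes from the preset neural network.

```python
from typing import Callable, Dict, List, Tuple


def find_clones(first_functions: Dict[int, object],
                second_function: object,
                similarity: Callable[[object, object], float],
                threshold: float) -> List[Tuple[int, float]]:
    """Return (start address, similarity) for each first function reaching the threshold."""
    hits: List[Tuple[int, float]] = []
    for address, features in first_functions.items():
        score = similarity(features, second_function)
        if score >= threshold:
            hits.append((address, score))
    return hits


# Stand-in for the trained network: a table of made-up similarity scores.
scores = {"f_a": 0.12, "f_b": 0.93, "f_c": -0.40}


def stand_in_similarity(features: object, _vulnerable: object) -> float:
    return scores[str(features)]


candidates = {0x1000: "f_a", 0x2000: "f_b", 0x3000: "f_c"}
hits = find_clones(candidates, "vulnerable_function", stand_in_similarity, threshold=0.8)
clone_exists = bool(hits)   # False only when every similarity is below the threshold
```

An empty result corresponds to the case where all first functions fall below the preset threshold, so no clone code of the vulnerability is reported.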
The binary vulnerability code clone detection system provided by the present invention extracts the function features of the first function in the binary code to be detected and the function features of the second function in the binary vulnerability code; inputs the function features of the first function and of the second function into the preset neural network respectively and uses the network to calculate the similarity between the two functions; and, when the similarity reaches the preset threshold, determines that clone code of the binary vulnerability code exists in the binary code to be detected. The system performs clone detection of vulnerability code accurately at the binary level, without requiring source code, and is therefore generally applicable. It also feeds code into the neural network at basic-block granularity for deep learning and thereby performs code clone detection, addressing the problems of existing code clone detection methods, namely incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation; it ensures comprehensive clone-type detection and accurate detection results while effectively improving detection efficiency.
Fig. 6 is a structural block diagram of a device for the binary vulnerability code clone detection method according to an embodiment of the present invention. Referring to Fig. 6, the device includes a processor 61, a memory 62, and a bus 63, where the processor 61 and the memory 62 communicate with each other via the bus 63. The processor 61 is used to call program instructions in the memory 62 to execute the methods provided by the above method embodiments, for example: extracting the function features of the first function in the binary code to be detected and the function features of the second function in the binary vulnerability code, the function features including basic block information, control flow information, and function call information; inputting the function features of the first function and of the second function into the preset neural network respectively and calculating the similarity between the first function and the second function using the preset neural network; and, when the similarity reaches the preset threshold, determining that clone code of the binary vulnerability code exists in the binary code to be detected.
This embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium. The computer program includes program instructions which, when executed by a computer, enable the computer to execute the methods provided by the above method embodiments, for example: extracting the function features of the first function in the binary code to be detected and the function features of the second function in the binary vulnerability code, the function features including basic block information, control flow information, and function call information; inputting the function features of the first function and of the second function into the preset neural network respectively and calculating the similarity between the first function and the second function using the preset neural network; and, when the similarity reaches the preset threshold, determining that clone code of the binary vulnerability code exists in the binary code to be detected.
This embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to execute the methods provided by the above method embodiments, for example: extracting the function features of the first function in the binary code to be detected and the function features of the second function in the binary vulnerability code, the function features including basic block information, control flow information, and function call information; inputting the function features of the first function and of the second function into the preset neural network respectively and calculating the similarity between the first function and the second function using the preset neural network; and, when the similarity reaches the preset threshold, determining that clone code of the binary vulnerability code exists in the binary code to be detected.
Those of ordinary skill in the art will appreciate that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
The embodiments described above, such as the device for the binary vulnerability code clone detection method, are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software together with a necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the above technical solution in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, the methods of this application are merely preferred embodiments and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A binary vulnerability code clone detection method, characterized by comprising:
extracting function features of a first function in binary code to be detected and function features of a second function in binary vulnerability code, the function features comprising basic block information, control flow information, and function call information;
inputting the function features of the first function and the function features of the second function separately into a preset neural network, and calculating the similarity between the first function and the second function using the preset neural network; and
when the similarity reaches a preset threshold, determining that a code clone of the binary vulnerability code exists in the binary code to be detected.
2. The method according to claim 1, characterized in that, when the similarities between all first functions in the binary code to be detected and the second function are each less than the preset threshold, it is determined that no code clone of the binary vulnerability code exists in the binary code to be detected.
3. The method according to claim 1, characterized in that calculating the similarity between the first function and the second function using the preset neural network specifically comprises:
using the preset neural network to obtain, from the basic block information of the first function and the basic block information of the second function respectively, a first sub-feature vector corresponding to each basic block in the first function and a second sub-feature vector corresponding to each basic block in the second function;
processing all the first sub-feature vectors according to the control flow information and function call information of the first function to obtain a first feature vector, and processing all the second sub-feature vectors according to the control flow information and function call information of the second function to obtain a second feature vector; and
calculating the cosine of the angle between the first feature vector and the second feature vector, and taking that cosine value as the similarity between the first function and the second function.
4. The method according to claim 1 or 3, characterized in that the basic block information of the first function and of the second function respectively comprises, for each basic block in the first function and in the second function, the start address of the basic block and the corresponding number of numeric constants, number of character constants, number of transfer instructions, number of function calls, number of instructions, number of arithmetic instructions, number of logical instructions, betweenness centrality, and number of child nodes.
5. The method according to claim 1 or 3, characterized in that the control flow information of the first function and of the second function respectively comprises the dependency relationship between every two basic blocks in the first function and in the second function.
6. The method according to claim 1 or 3, characterized in that the function call information of the first function comprises the start addresses of the functions called by the first function and the start addresses of the functions that call the first function; and the function call information of the second function comprises the start addresses of the functions called by the second function and the start addresses of the functions that call the second function.
7. The method according to claim 1, characterized in that the method further comprises:
obtaining a first sample function and a second sample function carrying a label that indicates clone or non-clone, and extracting function features of the first sample function and function features of the second sample function;
constructing the preset neural network and setting a target error for the preset neural network;
inputting the function features of the first sample function and the function features of the second sample function separately into the preset neural network to train the preset neural network; and
ending the training of the preset neural network when the difference between the actual output of the preset neural network and the expected output is not greater than the target error.
8. The method according to claim 7, characterized in that obtaining the first sample function and the second sample function carrying a label that indicates clone or non-clone specifically comprises:
obtaining a plurality of sample functions, and cross-compiling each sample function under different configurations to obtain a plurality of same-name functions and a plurality of different-name functions; and
forming, from every two same-name functions, a first sample function and a second sample function carrying the clone label, and forming, from every two different-name functions, a first sample function and a second sample function carrying the non-clone label.
9. A binary vulnerability code clone detection system, characterized by comprising:
a feature extraction module, configured to extract function features of a first function in binary code to be detected and function features of a second function in binary vulnerability code, the function features comprising basic block features, control flow information, and function call information;
a similarity calculation module, configured to input the function features of the first function and the function features of the second function separately into a preset neural network, and to calculate the similarity between the first function and the second function using the preset neural network; and
a clone detection module, configured to determine, when the similarity reaches a preset threshold, that a code clone of the binary vulnerability code exists in the binary code to be detected.
10. A device for the binary vulnerability code clone detection method, characterized by comprising:
at least one processor; and
at least one memory communicatively connected to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor, by calling the program instructions, is able to execute the method according to any one of claims 1 to 8.
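The sample-labelling step recited in claim 8 can be illustrated with a minimal Python sketch. It assumes each sample function has already been cross-compiled under several configurations and reduced to some feature object; the tuple layout, the configuration strings, and the name `label_pairs` are hypothetical and are not specified in the claims.

```python
from itertools import combinations


def label_pairs(compiled):
    """Build labelled training pairs from cross-compiled sample functions.

    compiled: list of (function_name, build_config, features) tuples
    (hypothetical layout). Two entries with the same function name are
    treated as a clone pair (label 1); two entries with different names
    are treated as a non-clone pair (label 0).
    """
    pairs = []
    for (name_a, _cfg_a, feat_a), (name_b, _cfg_b, feat_b) in combinations(compiled, 2):
        label = 1 if name_a == name_b else 0
        pairs.append((feat_a, feat_b, label))
    return pairs
```

Each resulting triple supplies the two feature inputs and the expected output used when training the preset neural network as in claim 7. In practice the non-clone pairs greatly outnumber the clone pairs, so some form of sampling would likely be needed; the claims do not address this.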
CN201810267094.7A 2018-03-28 2018-03-28 Binary vulnerability code clone detection method and system Active CN108491228B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810267094.7A CN108491228B (en) 2018-03-28 2018-03-28 Binary vulnerability code clone detection method and system

Publications (2)

Publication Number Publication Date
CN108491228A true CN108491228A (en) 2018-09-04
CN108491228B CN108491228B (en) 2020-03-17

Family

ID=63317013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810267094.7A Active CN108491228B (en) 2018-03-28 2018-03-28 Binary vulnerability code clone detection method and system

Country Status (1)

Country Link
CN (1) CN108491228B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635569A (en) * 2018-12-10 2019-04-16 国家电网有限公司信息通信分公司 A kind of leak detection method and device
CN110134435A (en) * 2019-05-29 2019-08-16 北京百度网讯科技有限公司 A kind of code repairs case acquisition methods, device, equipment and storage medium
CN110287702A (en) * 2019-05-29 2019-09-27 清华大学 A kind of binary vulnerability clone detection method and device
CN110414238A (en) * 2019-06-18 2019-11-05 中国科学院信息工程研究所 The search method and device of homologous binary code
CN111124487A (en) * 2018-11-01 2020-05-08 浙江大学 Code clone detection method and device and electronic equipment
CN112613040A (en) * 2020-12-14 2021-04-06 中国科学院信息工程研究所 Vulnerability detection method based on binary program and related equipment
CN113901474A (en) * 2021-09-13 2022-01-07 四川大学 Vulnerability detection method based on function-level code similarity
CN117473494A (en) * 2023-06-06 2024-01-30 兴华永恒(北京)科技有限责任公司 Method and device for determining homologous binary files, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104407872A (en) * 2014-12-04 2015-03-11 北京邮电大学 Code clone detection method
CN107229563A (en) * 2016-03-25 2017-10-03 中国科学院信息工程研究所 A kind of binary program leak function correlating method across framework

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HONG LIN et al.: "CVSSA: Cross-architecture Vulnerability Search in Firmware Based on Support Vector Machine and Attributed Control Flow Graph", 《FOURTH INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEM AND THEIR APPLICATIONS》 *
CHANG Qing et al.: "VDNS: A Cross-Platform Firmware Vulnerability Association Algorithm", 《Journal of Computer Research and Development》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111124487A (en) * 2018-11-01 2020-05-08 浙江大学 Code clone detection method and device and electronic equipment
CN111124487B (en) * 2018-11-01 2022-01-21 浙江大学 Code clone detection method and device and electronic equipment
CN109635569A (en) * 2018-12-10 2019-04-16 国家电网有限公司信息通信分公司 A kind of leak detection method and device
CN109635569B (en) * 2018-12-10 2020-11-03 国家电网有限公司信息通信分公司 Vulnerability detection method and device
CN110134435A (en) * 2019-05-29 2019-08-16 北京百度网讯科技有限公司 A kind of code repairs case acquisition methods, device, equipment and storage medium
CN110287702A (en) * 2019-05-29 2019-09-27 清华大学 A kind of binary vulnerability clone detection method and device
CN110287702B (en) * 2019-05-29 2020-08-11 清华大学 Binary vulnerability clone detection method and device
CN110134435B (en) * 2019-05-29 2023-01-10 北京百度网讯科技有限公司 Code repair case acquisition method, device, equipment and storage medium
CN110414238A (en) * 2019-06-18 2019-11-05 中国科学院信息工程研究所 The search method and device of homologous binary code
CN112613040A (en) * 2020-12-14 2021-04-06 中国科学院信息工程研究所 Vulnerability detection method based on binary program and related equipment
CN113901474A (en) * 2021-09-13 2022-01-07 四川大学 Vulnerability detection method based on function-level code similarity
CN117473494A (en) * 2023-06-06 2024-01-30 兴华永恒(北京)科技有限责任公司 Method and device for determining homologous binary files, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN108491228B (en) 2020-03-17

Similar Documents

Publication Publication Date Title
CN108491228A (en) A kind of binary vulnerability Code Clones detection method and system
CN111459799B (en) Software defect detection model establishing and detecting method and system based on Github
Alrabaee et al. Oba2: An onion approach to binary code authorship attribution
Han et al. Malware analysis using visualized image matrices
CN106557695B (en) A kind of malicious application detection method and system
EP4058916A1 (en) Detecting unknown malicious content in computer systems
CN110287702B (en) Binary vulnerability clone detection method and device
CN106503558A (en) A kind of Android malicious code detecting methods that is analyzed based on community structure
Palahan et al. Extraction of statistically significant malware behaviors
CN109871686A (en) Rogue program recognition methods and device based on icon representation and software action consistency analysis
CN112115326B (en) Multi-label classification and vulnerability detection method for Ethereum smart contracts
Yang et al. Asteria-Pro: Enhancing Deep Learning-based Binary Code Similarity Detection by Incorporating Domain Knowledge
Cao et al. FTCLNet: Convolutional LSTM with Fourier transform for vulnerability detection
KR20200110141A (en) Method for data processing to derive new drug candidate substance
CN116361788A (en) Binary software vulnerability prediction method based on machine learning
CN115455382A (en) Semantic comparison method and device for binary function codes
CN111400713B (en) Malicious software population classification method based on operation code adjacency graph characteristics
Zhao et al. Suzzer: A vulnerability-guided fuzzer based on deep learning
CN103679034B (en) A kind of computer virus analytic system based on body and feature extracting method thereof
US11164658B2 (en) Identifying salient features for instances of data
CN114139636B (en) Abnormal operation processing method and device
CN116702157B (en) Intelligent contract vulnerability detection method based on neural network
Utkin et al. Evaluating the impact of source code parsers on ML4SE models
Chen et al. Research on automatic vulnerability mining model based on knowledge graph
Grover et al. Malware threat analysis of IoT devices using deep learning neural network methodologies

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant