CN115659356A - Method for realizing self-adaptive adjustment of path search depth based on abstract syntax tree - Google Patents
Method for realizing self-adaptive adjustment of path search depth based on abstract syntax tree Download PDFInfo
- Publication number
- CN115659356A CN115659356A CN202211394130.9A CN202211394130A CN115659356A CN 115659356 A CN115659356 A CN 115659356A CN 202211394130 A CN202211394130 A CN 202211394130A CN 115659356 A CN115659356 A CN 115659356A
- Authority
- CN
- China
- Prior art keywords
- contract
- path search
- ast
- search depth
- execution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Images
Landscapes
- Devices For Executing Special Programs (AREA)
Abstract
The invention belongs to the field of path search methods, in particular to a method for realizing adaptive adjustment of path search depth based on an abstract syntax tree, which aims at the problems that the existing contract with larger code line number has deeper function call and symbol execution cannot cover, the contract oriented to the symbol execution is basically a low-version intelligent contract without exceeding 200 lines, and the path is difficult to cover by the symbol execution in a long/complex contract, and the following scheme is proposed, and comprises the following steps: s1: inputting: taking an open source contract on EtherScan or an intelligent contract source code written by a developer as program input; s2: compiling: the intelligent contract is compiled through a solid compiler, and the AST information and the intelligent contract byte codes generated in the compiling process are collected.
Description
Technical Field
The invention relates to the technical field of path search methods, in particular to a method for realizing adaptive adjustment of path search depth based on an abstract syntax tree.
Background
At present, the following schemes are mainly used in the research similar to the invention:
the first scheme is as follows: research [1] provides Oyente, an intelligent contract static analysis technology based on symbolic execution, and aims to detect classic problems such as reentry, access control, integer overflow, timestamp dependence and the like in an intelligent contract.
Scheme II: research [2] proposes VerX, which is a detection tool for security attributes of intelligent contract projects, detects intelligent contracts through symbolic execution technology, and realizes data value prediction in the detection process of the intelligent contracts through acquisition of data types in AST, thereby facilitating subsequent attribute inspection.
As an early symbol execution tool, oyente only detects a more classic intelligent contract vulnerability and is mainly oriented to a low-version contract with relatively few code lines, and meanwhile, the depth of the Oyente in the path searching process cannot be dynamically adjusted, so that the Oyente cannot detect that a deep path called by a more complex function exists in the contract of the current version, thereby reducing the code coverage rate and the symbol execution efficiency and being incapable of effectively detecting the code; although VerX utilizes the AST information of the symbolic execution technology and the smart contract, it does not solve the problem that the deep path under the complex function call scenario in the contract cannot be covered. AST is used for predicting abstract variable values, so that contract attributes can be conveniently verified;
therefore, a method for implementing adaptive adjustment of path search depth based on abstract syntax tree is needed to solve the above-mentioned problems.
Disclosure of Invention
Based on the problems, the invention provides a method for realizing path search depth self-adaptive adjustment based on an abstract syntax tree, which aims to solve the important problem that the current contract with a large code line number has a deeper function call and cannot be covered by symbol execution, and the current contract oriented by many symbol execution tools is basically a low-version intelligent contract without exceeding 200 lines, so that the problem that the path is difficult to be covered by symbol execution in a long/complex contract is ignored. The method is based on the symbolic execution technology, combines AST information capable of reflecting intelligent contract source code semantics, adaptively adjusts the path search depth in the symbolic execution process, and avoids the problems that vulnerability positions cannot be detected and coverage rate is low in the production contracts.
In order to achieve the purpose, the invention adopts the following technical scheme:
the method for realizing the self-adaptive adjustment of the path search depth based on the abstract syntax tree comprises the following steps:
s1: inputting: taking an open source contract on EtherScan or an intelligent contract source code written by a developer as program input;
s2: compiling: compiling an intelligent contract through a Solidity compiler, and collecting AST information and intelligent contract byte codes generated in the compiling process;
s3: acquiring AST information in S2 by using an AST information extraction module to perform self-adaptive path search depth adjustment, wherein the AST information extraction module comprises an AST traversal search part and a call graph construction part;
s4: adjusting the path search depth: according to the call graph obtained in the S3, before the symbol execution, the path search depth of the symbol execution is adaptively adjusted according to the analysis of the extracted contract complexity, wherein the path search depth of the symbol execution is an important parameter for determining whether the program code can be covered as much as possible and detecting the program vulnerability;
s5: selecting a path search depth adjustment factor K for adaptively adjusting parameters of the symbol execution path search depth so as to improve the code coverage capability and vulnerability detection capability of the symbol execution on the complex contract and obtain the latest adaptively adjusted path search depth m;
s6: performing symbol execution: according to the intelligent contract byte code of the intelligent contract and the pre-designed execution operation code modeling and memory modeling, the path in the program is searched and constrained and solved by combining the adaptive path search depth m in the S5, and the vulnerability in the contract is analyzed through the specific constraint solving condition and parameters;
s7: and outputting a result: and outputting a vulnerability detection result according to the analysis result finished by the symbol execution.
Preferably, in S1, the EtherScan is a blockchain browser of the etherhouse network, and the EtherScan can be used for searching transactions, blocks, wallet addresses, smart contracts, and other on-chain data.
Preferably, in S2, the intelligent contract is a compilation result of an open source contract on EtherScan or a contract source code written by a developer in S1, and the path search depth in the subsequent symbol execution process is adaptively adjusted through AST information and a policy of path search depth adjustment, where the AST information is used as a basis for adaptive adjustment of the path search depth in the subsequent symbol execution process, and the intelligent contract bytecode is a subject of symbol execution, and the process of symbol execution is to be established on a disassembly operation code of the intelligent contract bytecode.
Preferably, in S2, the identity compiler is used to convert the intelligent contract code written by the developer into an Ethernet Virtual Machine (EVM) instruction code, and the EVM instruction code is uploaded to the ethernet via transaction packaging and finally parsed and executed via the EVM.
Preferably, in S3, the AST traversal search: the AST information of the intelligent contract can be obtained through an AST instruction of a solid compiler, the AST information is in a structured json data format, all information in a source code is shown in the json data structure in a key-value mode, for content in the source code, the AST of the contract is provided with a key value pair taking a node type as a key so as to mark the intention and the category of a source code statement, and all source code statements are shown in the node key value pair of the AST in a node mode.
Preferably, in S3, a call graph at a contract function level may be constructed according to a key-value pair set obtained by AST traversal search, and unlike a traditional contract control flow graph constructed based on an intelligent contract bytecode, inter-function call information extracted from source code AST information is clearly obtainable without dynamically generating and perfecting a whole graph structure in an execution process.
Preferably, in S4, in order to adaptively adjust the search policy for contracts that cannot be analyzed while symbol execution is in progress, the analysis result for the contract needs to be applied to policy adjustment before symbol execution.
Preferably, in said S5In S5, the calculation formula of K is:wherein D is max Representing the calling path with the longest path in the calling flow diagram obtained in the step 3, wherein the path comprises N nodes N 1 To N N The out degree of the ith node is O i ,
Theta is the scaling factor. By pair D max Calculating the out-degree sum from the 2 nd node to the n-1 st node to obtain the calling complexity existing on the longest path, and finally obtaining the calling complexity through calculation
Theta adjusts the value to obtain the final path search depth adjustment factor K, and then the formula is used for:
m=n-K
and finally, obtaining the latest self-adaptively adjusted path search depth m according to the originally set path search depth parameter value n and the value of K.
Preferably, in S5, a symbolic execution module is used during symbolic execution, the execution of opcode modeling and memory modeling is designed in advance before the input step and the compiling step, the symbolic execution module performs search and constraint solution on a path in the program in combination with the adaptive path search depth m in S5, and a vulnerability existing in a contract is analyzed through a specific constraint solution and parameters.
Compared with the prior art, the invention has the beneficial effects that:
1. the invention firstly provides a contract path search depth dynamic adjustment strategy for a contract with a longer current code line number based on the contract abstract syntax tree information, and can solve the problem that the fixed search depth cannot detect the contract vulnerability position and the low coverage rate of symbol execution;
2. the method makes full use of the semantic information of the intelligent contract source code, can guide the symbolic execution based on the intelligent contract byte code while not influencing the symbolic execution efficiency, and leads the vulnerability detection and symbolic execution process to be more efficient;
3. the invention provides a method for realizing self-adaptive adjustment of path search depth based on an abstract syntax tree, which focuses on the current intelligent contract scene with complex and various logics and long code line number, solves the problems that the coverage rate is low and a vulnerability cannot be detected due to the fact that a deep path cannot be detected by the traditional intelligent contract symbol execution technology, and provides more comprehensive security guarantee for an intelligent contract.
Drawings
Fig. 1 is a block diagram of a method for implementing adaptive adjustment of path search depth based on an abstract syntax tree according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments.
Examples
Referring to fig. 1, the method for implementing adaptive adjustment of path search depth based on abstract syntax tree includes the following steps:
s1: inputting: taking an open source contract on EtherScan or an intelligent contract source code written by a developer as program input;
s2: compiling: compiling an intelligent contract through a Solidity compiler, and collecting AST information and intelligent contract byte codes generated in the compiling process;
s3: acquiring AST information in S2 by using an AST information extraction module to perform self-adaptive path search depth adjustment, wherein the AST information extraction module comprises an AST traversal search part and a call graph construction part;
s4: adjusting the path search depth: according to the call graph obtained in the S3, before the symbol execution, the path search depth of the symbol execution is adaptively adjusted according to the analysis of the extracted contract complexity, wherein the path search depth of the symbol execution is an important parameter for determining whether the program code can be covered as much as possible and detecting the program vulnerability;
s5: selecting a path search depth adjustment factor K for adaptively adjusting parameters of the symbol execution path search depth so as to improve the code coverage capability and vulnerability detection capability of the symbol execution on the complex contract and obtain the latest adaptively adjusted path search depth m;
s6: performing symbolic execution: according to the intelligent contract byte code of the intelligent contract and the pre-designed execution operation code modeling and memory modeling, the path in the program is searched and constrained and solved by combining the adaptive path search depth m in the S5, and the vulnerability in the contract is analyzed through the specific constraint solving condition and parameters;
s7: and outputting a result: and outputting a vulnerability detection result according to the analysis result finished by the symbol execution.
In this embodiment, in S1, etherScan is a blockchain browser of an ethernet network, where EtherScan can be used to search transactions, blocks, wallet addresses, smart contracts, and other on-chain data, in S2, a smart contract is a compilation result of an open source contract on EtherScan or a contract source code written by a developer in S1, a path search depth in a subsequent symbol execution process is adaptively adjusted through AST information and a policy of path search depth adjustment, where AST information is used as a basis for adaptive adjustment of a path search depth in the subsequent symbol execution process, and a smart contract bytecode is a subject of symbol execution, the process of symbol execution is established on a disassembly operation code of the smart contract bytecode, in S2, a soidity compiler functions to convert a smart contract code written by a developer into Ethernet Virtual Machine (EVM) instruction codes, and the EVM instruction codes are uploaded to the ethernet via transaction packaging and are finally parsed and executed by EVM, in S3, an AST traversal search: the AST information of the intelligent contract can be obtained through an AST instruction of a contract compiler, the AST information is in a structured json data format, all information in a source code is displayed in the json data structure in a key-value form, for the content in the source code, the AST of the contract has a key value pair with a node type as a key, so that the intention and the category of a source code statement are marked, all source code statements are displayed in the node key value pair of the AST in a node form, for example, the first row of the contract declares the version of the contract compiler, the node type of the statement in the AST is 'Pragmadiative', the node type of the statement declaring a variable in the contract is 'VariDeclaration', the node type of the statement declaring a function statement in the AST is 'functional definition', and the like, and the AST declares the entry type of the statement of the variable in the ASTInformation extraction mainly focuses on a node with a nodeType of 'functional Call', the node comprises a call to another function, and the node is obtained by searching and extracting information of the node in AST<function_id,functioncall_ref_id>In S3, according to key value pair set obtained by AST traversal search, a call graph of contract function level can be constructed, different from the traditional contract control flow graph constructed based on intelligent contract byte code, the call information between functions extracted from source code AST information can be clearly obtained, and the whole graph structure is not required to be dynamically generated and perfected in the execution process<1,2>,<2,3>And<3,None>for the whole call graph constructed according to AST information, the nodes of the graph are function _ ids, namely 1, 2 and 3 in the above example, and the connecting lines in the graph are generated according to function _ ref _ id, and the function node to which the second value (function _ ref _ id) of the key-value pair points is connected for the first value (function id) in each key-value pair by traversing all key-value pairs. The call path and the distance from a function can be clearly obtained by constructing the call flow graph, for example, in the above example, the distance from the function 1 to the function 3 is 2, and the call path includes 1->2 and 2->In S3,s 4, in order to deal with the contract that cannot be analyzed in the process of executing the symbol execution, the analysis result of the contract needs to be applied to the policy adjustment before the symbol execution, and in S5, the calculation formula of K is:wherein D is max Representing the calling path with the longest path in the calling flow diagram obtained in the step 3, wherein the path comprises N nodes N 1 To N N Egress of the ith nodeDegree of O i And theta is the scaling factor. By pair D max Calculating the out-degree sum from the 2 nd node and the (n-1) th node to obtain the calling complexity existing on the longest path, finally adjusting the value through theta to obtain the final path search depth adjustment factor K, and then obtaining the final path search depth adjustment factor K through a formula:
m=n-K
and finally, obtaining the latest self-adaptively adjusted path search depth m according to the originally set path search depth parameter value n and the value K, wherein in S5, a symbol execution module is used during symbol execution, the execution of operation code modeling and memory modeling is designed in advance before the input step and the compiling step, the symbol execution module is combined with the self-adaptive path search depth m in S5 to search and constrain solution the path in the program, and the vulnerability in the contract is analyzed through a specific constraint solution condition and parameters.
The invention provides a method for realizing self-adaptive adjustment of path search depth based on an abstract syntax tree, which focuses on the current intelligent contract scene with complex and various logics and long code line number, solves the problems that the coverage rate is low and a vulnerability cannot be detected due to the fact that a deep path cannot be detected by the traditional intelligent contract symbol execution technology, and provides more comprehensive security guarantee for an intelligent contract.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art should be considered to be within the technical scope of the present invention, and the technical solutions and the inventive concepts thereof according to the present invention should be equivalent or changed within the scope of the present invention.
Claims (9)
1. The method for realizing the self-adaptive adjustment of the path search depth based on the abstract syntax tree is characterized by comprising the following steps of:
s1: inputting: taking an open source contract on EtherScan or an intelligent contract source code written by a developer as program input;
s2: compiling: compiling an intelligent contract through a Solidity compiler, and collecting AST information and intelligent contract byte codes generated in the compiling process;
s3: acquiring AST information in S2 by using an AST information extraction module to perform self-adaptive path search depth adjustment, wherein the AST information extraction module comprises an AST traversal search part and a call graph construction part;
s4: adjusting the path search depth: according to the call graph obtained in the S3, before the symbol execution, the path search depth of the symbol execution is adaptively adjusted according to the analysis of the extracted contract complexity, wherein the path search depth of the symbol execution is an important parameter for determining whether the program code can be covered as much as possible and detecting the program vulnerability;
s5: selecting a path search depth adjustment factor K for adaptively adjusting parameters of the symbol execution path search depth so as to improve the code coverage capability and vulnerability detection capability of the symbol execution on the complex contract and obtain the latest adaptively adjusted path search depth m;
s6: performing symbolic execution: modeling and memory modeling according to intelligent contract byte codes and pre-designed execution operation codes of the intelligent contracts, searching and constraint solving paths in the programs by combining the self-adaptive path search depth m in the S5, and analyzing vulnerabilities in the contracts through specific constraint solving conditions and parameters;
s7: and outputting a result: and outputting a vulnerability detection result according to the analysis result finished by the symbol execution.
2. The method of claim 1, wherein in S1, the EtherScan is a blockchain browser of an ethernet network, and the EtherScan is available for searching transactions, blocks, wallet addresses, smart contracts, and other on-chain data.
3. The method of claim 1, wherein in S2, the intelligent contract is a compilation result of an open source contract on EtherScan or a contract source code written by a developer in S1, and the path search depth in the subsequent symbol execution process is adaptively adjusted according to AST information and a policy of path search depth adjustment, wherein the AST information is used as a basis for adaptive adjustment of the path search depth in the subsequent symbol execution process, and the intelligent contract bytecode is a subject of symbol execution, and the process of symbol execution is to be established on a disassembly operation code of the intelligent contract bytecode.
4. The method of claim 1, wherein in S2, a Solidity compiler is used to convert intelligent contract codes written by developers into Ethernet Virtual Machine (EVM) instruction codes, and the EVM instruction codes are uploaded to an ethernet via transaction packaging and finally parsed and executed via an EVM.
5. The method of claim 1, wherein in S3, the AST traversal search: the AST information of the intelligent contract can be obtained through an AST instruction of a solid compiler, the AST information is in a structured json data format, all information in a source code is displayed in the json data structure in a key-value mode, for content in the source code, the AST of the contract is provided with a key value pair taking a node type as a key, so that the intention and the category of a source code statement are marked, and all the source code statements are displayed in the node key value pair of the AST in a node mode.
6. The method of claim 1, wherein in S3, a contract function level call graph can be constructed according to a key value pair set obtained by an AST traversal search, and unlike a conventional contract control flow graph constructed based on an intelligent contract bytecode, inter-function call information extracted from source code AST information is clearly obtained without dynamically generating and perfecting an entire graph structure during execution, and in addition, a call graph and a control graph of a contract are different, each node represents a function, and connection lines between nodes represent call relationships existing between functions.
7. The method according to claim 1, wherein in S4, in order to adaptively adjust the search policy for the contract that cannot be analyzed in the process of executing the symbol, the analysis result of the contract needs to be applied to the policy adjustment before executing the symbol.
8. The method according to claim 1, wherein in S5, K is calculated as:wherein D is max The call path with the longest path in the call flow graph obtained in the step 3 of the table includes N nodes N 1 To N N The out degree of the ith node is O i And theta is the scaling factor. By pair D max Calculating the out-degree sum from the 2 nd node and the (n-1) th node to obtain the calling complexity existing on the longest path, finally adjusting the value through theta to obtain the final path search depth adjustment factor K, and then obtaining the final path search depth adjustment factor K through a formula:
m=n-K
and finally, obtaining the latest self-adaptively adjusted path search depth m according to the originally set path search depth parameter value n and the value K.
9. The method according to claim 1, wherein in S5, a symbolic execution module is used during symbolic execution, where the execution of opcode modeling and memory modeling is pre-designed before the input step and the compilation step, and the symbolic execution module performs search and constraint solution on the paths in the program in combination with the adaptive path search depth m in S5, and analyzes the vulnerabilities in contracts through specific constraint solution conditions and parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211394130.9A CN115659356A (en) | 2022-11-08 | 2022-11-08 | Method for realizing self-adaptive adjustment of path search depth based on abstract syntax tree |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211394130.9A CN115659356A (en) | 2022-11-08 | 2022-11-08 | Method for realizing self-adaptive adjustment of path search depth based on abstract syntax tree |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115659356A true CN115659356A (en) | 2023-01-31 |
Family
ID=85017055
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211394130.9A Pending CN115659356A (en) | 2022-11-08 | 2022-11-08 | Method for realizing self-adaptive adjustment of path search depth based on abstract syntax tree |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115659356A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116069667A (en) * | 2023-03-06 | 2023-05-05 | 天津卓朗昆仑云软件技术有限公司 | Test case auxiliary positioning method and device based on code analysis |
US11861335B1 (en) * | 2023-07-28 | 2024-01-02 | Intuit Inc. | Learn to extract from syntax tree |
-
2022
- 2022-11-08 CN CN202211394130.9A patent/CN115659356A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116069667A (en) * | 2023-03-06 | 2023-05-05 | 天津卓朗昆仑云软件技术有限公司 | Test case auxiliary positioning method and device based on code analysis |
US11861335B1 (en) * | 2023-07-28 | 2024-01-02 | Intuit Inc. | Learn to extract from syntax tree |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112100054B (en) | Data management and control oriented program static analysis method and system | |
CN111125716B (en) | Method and device for detecting Ethernet intelligent contract vulnerability | |
Dávid et al. | Foundations for streaming model transformations by complex event processing | |
CN115659356A (en) | Method for realizing self-adaptive adjustment of path search depth based on abstract syntax tree | |
CN110399133B (en) | JavaScript code optimization method based on front-end byte code technology | |
CN106547520B (en) | Code path analysis method and device | |
CN103116540A (en) | Dynamic symbol execution method and device based on global superblock domination graph | |
CN110674503B (en) | Intelligent contract endless loop detection method based on graph convolution neural network | |
CN111209211A (en) | Cross-project software defect prediction method based on long-term and short-term memory neural network | |
CN107515739A (en) | Improve the method and device of code execution performance | |
CN107358099B (en) | Useless variable detection method based on LLVM intermediate representation program slicing technology | |
WO2012051844A1 (en) | Intelligent network platform, method for executing services and method for analyzing service abnormity | |
CN117150497A (en) | Intelligent contract denial of service vulnerability detection method based on symbolic execution constraint optimization | |
CN103793653B (en) | A kind of program dependence based on tree optimization analyzes method and system | |
CN105302551B (en) | A kind of method and system of the Orthogonal Decomposition construction and optimization of big data processing system | |
Zhang et al. | Smart contract vulnerability detection method based on bi-lstm neural network | |
CN106681781A (en) | Implementation method and system for real-time computing service | |
CN113505278A (en) | Graph matching method and device, electronic equipment and storage medium | |
CN110309656B (en) | Implicit type conversion security detection method | |
CN116383832A (en) | Intelligent contract vulnerability detection method based on graph neural network | |
KR20180135528A (en) | Method of creating the balanced parse tree having optimized height | |
CN115562674B (en) | Load-aware software configuration parameter adjustment method | |
Zech et al. | Towards Risk--Driven Security Testing of Service Centric Systems | |
CN116594869A (en) | Memory defect static detection method based on type region model | |
CN113703774B (en) | Feature Envy code bad smell detection method based on dependency structure characteristics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |