[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3453483.3454068acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article
Public Access

Boosting SMT solver performance on mixed-bitwise-arithmetic expressions

Published: 18 June 2021 Publication History

Abstract

Satisfiability Modulo Theories (SMT) solvers have been widely applied in automated software analysis to reason about the queries that encode the essence of program semantics, relieving the heavy burden of manual analysis. Many SMT solving techniques rely on solving Boolean satisfiability problem (SAT), which is an NP-complete problem, so they use heuristic search strategies to seek possible solutions, especially when no known theorem can efficiently reduce the problem. An emerging challenge, named Mixed-Bitwise-Arithmetic (MBA) obfuscation, impedes SMT solving by constructing identity equations with both bitwise operations (and, or, negate) and arithmetic computation (add, minus, multiply). Common math theorems for bitwise or arithmetic computation are inapplicable to simplifying MBA equations, leading to performance bottlenecks in SMT solving.
In this paper, we first scrutinize solvers' performance on solving different categories of MBA expressions: linear, polynomial, and non-polynomial. We observe that solvers can handle simple linear MBA expressions, but facing a severe performance slowdown when solving complex linear and non-linear MBA expressions. The root cause is that complex MBA expressions break the reduction laws for pure arithmetic or bitwise computation. To boost solvers' performance, we propose a semantic-preserving transformation to reduce the mixing degree of bitwise and arithmetic operations. We first calculate a signature vector based on the truth table extracted from an MBA expression, which captures the complete MBA semantics. Next, we generate a simpler MBA expression from the signature vector. Our large-scale evaluation on 3000 complex MBA equations shows that our technique significantly boost modern SMT solvers' performance on solving MBA formulas.

References

[1]
Sebastian Banescu and Alexander Pretschner. 2018. Chapter Five - A Tutorial on Software Obfuscation.
[2]
Sébastien Bardin, Robin David, and Jean-Yves Marion. 2017. Backward-Bounded DSE: Targeting Infeasibility Questions on Obfuscated Codes. In Proceedings of the 38th IEEE Symposium on Security and Privacy (S&P’17). https://doi.org/10.1109/SP.2017.36
[3]
Clark Barrett, Aaron Stump, and Cesare Tinelli. 2010. The SMT-LIB Standard – Version 2.0. In Proceedings of the 8th International Workshop on Satisfiability Modulo Theories.
[4]
Michael Beeler, R William Gosper, and Richard Schroeppel. 1972. Hakmem. Massachusetts Institute of Technology, Artificial Intelligence Laboratory.
[5]
Fabrizio Biondi, Sébastien Josse, and Axel Legay. 2016. Bypassing Malware Obfuscation with Dynamic Synthesis. https://ercim-news.ercim.eu/en106/special/bypassing-malware-obfuscation-with-dynamic-synthesis
[6]
Sandrine Blazy and Rémi Hutin. 2019. Formal Verification of a Program Obfuscation Based on Mixed Boolean-arithmetic Expressions. In Proceedings of the 8th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP’19). https://doi.org/10.1145/3293880.3294103
[7]
Tim Blazytko, Moritz Contag, Cornelius Aschermann, and Thorsten Holz. 2017. Syntia: Synthesizing the Semantics of Obfuscated Code. In Proceedings of the 26th USENIX Security Symposium (USENIX Security’17).
[8]
Ella Bounimova, Patrice Godefroid, and David Molnar. 2013. Billions and Billions of Constraints: Whitebox Fuzz Testing in Production. In Proceedings of the 2013 International Conference on Software Engineering (ICSE’13). https://doi.org/10.1109/ICSE.2013.6606558
[9]
Robert Brummayer and Armin Biere. 2009. Boolector: An Efficient SMT Solver for Bit-Vectors and Arrays. In Proceedings of the 15th International Conference on Tools and Algorithms for the Construction and Analysis of Systems. https://doi.org/10.1007/978-3-642-00768-2_16
[10]
Cristian Cadar, Daniel Dunbar, and Dawson Engler. 2008. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs. In Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation (OSDI’08).
[11]
Jianhui Chen and Fei He. 2018. Control Flow-Guided SMT Solving for Program Verification. In Proceedings of the 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE’18). https://doi.org/10.1145/3238147.3238218
[12]
Christian Collberg, Sam Martin, Jonathan Myers, and Jasvir Nagra. 2012. Distributed Application Tamper Detection via Continuous Software Updates. In Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC’12). https://doi.org/10.1145/2420950.2420997
[13]
Christian Collberg, Sam Martin, Jonathan Myers, and Bill Zimmerman. [n.d.]. Documentation for Arithmetic Encodings in Tigress. http://tigress.cs.arizona.edu/transformPage/docs/encodeArithmetic
[14]
Christian Collberg, Sam Martin, Jonathan Myers, and Bill Zimmerman. [n.d.]. Documentation for Data Encodings in Tigress. http://tigress.cs.arizona.edu/transformPage/docs/encodeData
[15]
Stephen A Cook. 1971. The Complexity of Theorem-Proving Procedures. In Proceedings of the 3rd Annual ACM Symposium on Theory of Computing. 151–158. https://doi.org/10.1145/800157.805047
[16]
Leonardo De Moura and Grant Olney Passmore. 2013. The Strategy Challenge in SMT Solving. In Automated Reasoning and Mathematics.
[17]
Ninon Eyrolles. 2017. Obfuscation with Mixed Boolean-Arithmetic Expressions: Reconstruction, Analysis and Simplification Tools. Ph.D. Dissertation. Université Paris-Saclay.
[18]
Ninon Eyrolles, Louis Goubin, and Marion Videau. 2016. Defeating MBA-based Obfuscation. In Proceedings of the 2016 ACM Workshop on Software PROtection (SPRO’16). https://doi.org/10.1145/2995306.2995308
[19]
Zhoulai Fu and Zhendong Su. 2016. XSat: A Fast Floating-Point Satisfiability Solver. In Proceedings of the 28th International Conference on Computer Aided Verification (CAV’16). https://doi.org/10.1007/978-3-319-41540-6_11
[20]
Vijay Ganesh and David L. Dill. 2007. A Decision Procedure for Bit-vectors and Arrays. In Proceedings of the 19th International Conference in Computer Aided Verification (CAV’07). https://doi.org/10.1007/978-3-540-73368-3_52
[21]
Vijay Ganesh, Adam Kiezun, Shay Artzi, Philip J. Guo, Pieter Hooimeijer, and Michael D. Ernst. 2011. HAMPI: A String Solver for Testing, Analysis and Vulnerability Detection. In Proceedings of 23rd International Conference on Computer Aided Verification (CAV’11). https://doi.org/10.1007/978-3-642-22110-1_1
[22]
Patrice Godefroid, Nils Klarlund, and Koushik Sen. 2005. DART: Directed Automated Random Testing. In Proceedings of the 2005 ACM Conference on Programming Language Design and Implementation (PLDI’05). https://doi.org/10.1145/1065010.1065036
[23]
Patrice Godefroid, Michael Y. Levin, and David Molnar. 2008. Automated Whitebox Fuzz Testing. In Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS’08).
[24]
Irdeto. 2017. Irdeto Cloaked CA: a secure, flexible and cost-effective conditional access system. www.irdeto.com
[25]
Xiangyang Jia, Carlo Ghezzi, and Shi Ying. 2015. Enhancing Reuse of Constraint Solutions to Improve Symbolic Execution. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA’15). https://doi.org/10.1145/2771783.2771806
[26]
Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Commun. ACM, 52, 7 (2009), July.
[27]
Xin Li, Yongjuan Liang, Hong Qian, Yi-Qi Hu, Lei Bu, Yang Yu, Xin Chen, and Xuandong Li. 2016. Symbolic Execution of Complex Program Driven by Machine Learning Based Constraint Solving. In Proceedings of 31st IEEE/ACM International Conference on Automated Software Engineering (ASE’16).
[28]
Clifford Liem, Yuan Xiang Gu, and Harold Johnson. 2008. A Compiler-based Infrastructure for Software-protection. In Proceedings of the 3rd ACM SIGPLAN Workshop on Programming Languages and Analysis for Security (PLAS’08). https://doi.org/10.1145/1375696.1375702
[29]
Daniel Liew, Cristian Cadar, Alastair F Donaldson, and J Ryan Stinnett. 2019. Just Fuzz It: Solving Floating-Point Constraints using Coverage-Guided Fuzzing. In Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE’19). https://doi.org/10.1145/3338906.3338921
[30]
Haoyu Ma, Chunfu Jia, Shijia Li, Wantong Zheng, and Dinghao Wu. 2019. Xmark: Dynamic Software Watermarking Using Collatz Conjecture. IEEE Transactions on Information Forensics and Security, 14, 11 (2019), March.
[31]
MapleSoft. 2020. The Essential Tool for Mathematics. https://www.maplesoft.com/products/maple/
[32]
Aleksandar Milicevic, Joseph P Near, Eunsuk Kang, and Daniel Jackson. 2015. Alloy*: A General-Purpose Higher-Order Relational Constraint Solver. In Proceedings of the 37th International Conference on Software Engineering (ICSE’15). https://doi.org/10.1007/s10703-016-0267-2
[33]
Camille Mougey and Francis Gabriel. 2014. DRM Obfuscation Versus Auxiliary Attacks. In REcon Conference.
[34]
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’08). https://doi.org/10.1007/978-3-540-78800-3_24
[35]
Aina Niemetz, Mathias Preiner, and Armin Biere. 2016. Boolector at the SMT Competition 2017. Institute for Formal Models and Verification, Johannes Kepler University.
[36]
Mathilde Ollivier, Sébastien Bardin, Richard Bonichon, and Jean-Yves Marion. 2019. How to Kill Symbolic Deobfuscation for Free (or: Unleashing the Potential of Path-oriented Protections). In Proceedings of the 35th Annual Computer Security Applications Conference (ACSAC’19). https://doi.org/10.1145/3359789.3359812
[37]
Mathilde Ollivier, Sébastien Bardin, Richard Bonichon, and Jean-Yves Marion. 2019. Obfuscation: Where Are We in anti-DSE Protections? (a First Attempt). In Proceedings of the 9th Workshop on Software Security, Protection, and Reverse Engineering (SSPREW’19). https://doi.org/10.1145/3371307.3371309
[38]
Corina S Păsăreanu, Neha Rungta, and Willem Visser. 2011. Symbolic Execution with Mixed Concrete-Symbolic Solving. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA’11). https://doi.org/10.1145/2001420.2001425
[39]
Quarkslab. 2019. Epona Application Protection v1.5. https://epona.quarkslab.com
[40]
sagemath. 2020. SageMath. http://www.sagemath.org/
[41]
Koushik Sen, Darko Marinov, and Gul Agha. 2005. CUTE: A Concolic Unit Testing Engine for C. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering. https://doi.org/10.1145/1095430.1081750
[42]
Monirul Sharif, Andrea Lanzi, Jonathon Giffin, and Wenke Lee. 2008. Impeding Malware Analysis Using Conditional Code Obfuscation. In Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS’08).
[43]
Julian Thomé, Lwin Khin Shar, Domenico Bianculli, and Lionel Briand. 2017. Search-Driven String Constraint Solving for Vulnerability Detection. In Proceedings of 39th International Conference on Software Engineering (ICSE’17). https://doi.org/10.1109/ICSE.2017.26
[44]
H.S. Warren. 2003. Hacker’s Delight. Addison-Wesley.
[45]
WOLFRAM. 2020. WOLFRAM MATHEMATICA. http://www.wolfram.com/mathematica/
[46]
Yunhui Zheng, Xiangyu Zhang, and Vijay Ganesh. 2013. Z3-str: A Z3-Based String Solver for Web Application Analysis. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’13). https://doi.org/10.1145/2491411.2491456
[47]
Yongxin Zhou and Alec Main. 2006. Diversity via Code Transformations: A Solution for NGNA Renewable Security. The National Cable and Telecommunications Association Show.
[48]
Yongxin Zhou, Alec Main, Yuan X. Gu, and Harold Johnson. 2007. Information Hiding in Software with Mixed Boolean-Arithmetic Transforms. In Proceedings of the 8th International Conference on Information Security Applications (WISA’07). https://doi.org/10.1007/978-3-540-77535-5_5

Cited By

View all
  • (2024)A Method to Quantitative Compare Obfuscating TtransformationsСпособ количественного сравнения обфусцирующих преобразованийInformatics and AutomationИнформатика и автоматизация10.15622/ia.23.3.323:3(684-726)Online publication date: 28-May-2024
  • (2024)SMT Theory Arbitrage: Approximating Unbounded Constraints using Bounded TheoriesProceedings of the ACM on Programming Languages10.1145/36563878:PLDI(246-271)Online publication date: 20-Jun-2024
  • (2024)Concrete Constraint Guided Symbolic ExecutionProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639078(1-12)Online publication date: 20-May-2024
  • Show More Cited By

Index Terms

  1. Boosting SMT solver performance on mixed-bitwise-arithmetic expressions

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation
    June 2021
    1341 pages
    ISBN:9781450383912
    DOI:10.1145/3453483
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 June 2021

    Permissions

    Request permissions for this article.

    Check for updates

    Badges

    Author Tags

    1. Mixed Boolean Arithmetic
    2. SMT Solvers
    3. Simplification

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    PLDI '21
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 406 of 2,067 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)454
    • Downloads (Last 6 weeks)74
    Reflects downloads up to 14 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)A Method to Quantitative Compare Obfuscating TtransformationsСпособ количественного сравнения обфусцирующих преобразованийInformatics and AutomationИнформатика и автоматизация10.15622/ia.23.3.323:3(684-726)Online publication date: 28-May-2024
    • (2024)SMT Theory Arbitrage: Approximating Unbounded Constraints using Bounded TheoriesProceedings of the ACM on Programming Languages10.1145/36563878:PLDI(246-271)Online publication date: 20-Jun-2024
    • (2024)Concrete Constraint Guided Symbolic ExecutionProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639078(1-12)Online publication date: 20-May-2024
    • (2024)Defeating Data Plane Attacks With Program ObfuscationIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2023.327793921:3(1317-1330)Online publication date: May-2024
    • (2024)X-MBA: Towards Heterogeneous Mixed Boolean-Arithmetic DeobfuscationMILCOM 2024 - 2024 IEEE Military Communications Conference (MILCOM)10.1109/MILCOM61039.2024.10773656(1082-1087)Online publication date: 28-Oct-2024
    • (2023)On Simplifying Expressions with Mixed Boolean-ArithmeticModeling and Analysis of Information Systems10.18255/1818-1015-2023-2-140-15930:2(140-159)Online publication date: 14-Jun-2023
    • (2023)Speeding up SMT Solving via Compiler OptimizationProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616357(1177-1189)Online publication date: 30-Nov-2023
    • (2023)Simplifying Mixed Boolean-Arithmetic Obfuscation by Program Synthesis and Term RewritingProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security10.1145/3576915.3623186(2351-2365)Online publication date: 15-Nov-2023
    • (2023) Anchor: Fast and Precise Value-flow Analysis for Containers via Memory OrientationACM Transactions on Software Engineering and Methodology10.1145/356580032:3(1-39)Online publication date: 26-Apr-2023
    • (2023)Reverse Engineering of Obfuscated Lua Bytecode via Interpreter Semantics TestingIEEE Transactions on Information Forensics and Security10.1109/TIFS.2023.328925418(3891-3905)Online publication date: 1-Jan-2023
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media