[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3324884.3416644acmconferencesArticle/Chapter ViewAbstractPublication PagesaseConference Proceedingsconference-collections
research-article

Subdomain-based generality-aware debloating

Published: 27 January 2021 Publication History

Abstract

Programs are becoming increasingly complex and typically contain an abundance of unneeded features, which can degrade the performance and security of the software. Recently, we have witnessed a surge of debloating techniques that aim to create a reduced version of a program by eliminating the unneeded features therein. To debloat a program, most existing techniques require a usage profile of the program, typically provided as a set of inputs I. Unfortunately, these techniques tend to generate a reduced program that is over-fitted to I and thus fails to behave correctly for other inputs. To address this limitation, we propose DomGad, which has two main advantages over existing debloating approaches. First, it produces a reduced program that is guaranteed to work for subdomains, rather than for specific inputs. Second, it uses stochastic optimization to generate reduced programs that achieve a close-to-optimal tradeoff between reduction and generality (i.e., the extent to which the reduced program is able to correctly handle inputs in its whole domain). To assess the effectiveness of DomGad, we applied our approach to a benchmark of ten Unix utility programs. Our results are promising, as they show that DomGad could produce debloated programs that achieve, on average, 50% code reduction and 95% generality. Our results also show that DomGad performs well when compared with two state-of-the-art debloating approaches.

References

[1]
Ioannis Agadakos, Di Jin, David Williams-King, Vasileios P Kemerlis, and Georgios Portokalidis. 2019. Nibbler: debloating binary shared libraries. In Proceedings of the 35th Annual Computer Security Applications Conference (ACSAC). 70--83.
[2]
Abdulbaki Aydin, William Eiers, Lucas Bang, Tegan Brennan, Miroslav Gavrilov, Tevfik Bultan, and Fang Yu. 2018. Parameterized model counting for string and numeric constraints. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 400--410.
[3]
Babak Amin Azad, Pierre Laperdrix, and Nick Nikiforakis. 2019. Less is more: quantifying the security benefits of debloating web applications. In 28th USENIX Security Symposium (USENIX Security 19). 1697--1714.
[4]
Suparna Bhattacharya, Kanchi Gopinath, and Mangala Gowri Nanda. 2013. Combining concern input with program analysis for bloat detection. In Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages and applications (OOPSLA). ACM, 745--764.
[5]
Mateus Borges, Antonio Filieri, Marcelo d'Amorim, Corina S Păsăreanu, and Willem Visser. 2014. Compositional solution space quantification for probabilistic software analysis. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 123--132.
[6]
Erik Buchanan, Ryan Roemer, Hovav Shacham, and Stefan Savage. 2008. When good instructions go bad: Generalizing return-oriented programming to RISC. In Proceedings of the 15th ACM conference on Computer and communications security (CCS). ACM, 27--38.
[7]
Supratik Chakraborty, Kuldeep S Meel, Rakesh Mistry, and Moshe Y Vardi. 2016. Approximate probabilistic inference via word-level counting. In Thirtieth AAAI Conference on Artificial Intelligence (AAAI).
[8]
Yuting Chen, Ting Su, and Zhendong Su. 2019. Deep differential testing of JVM implementations. In Proceedings of the 41th International Conference on Software Engineering (ICSE). 1257--1268.
[9]
Herman Chernoff. 1952. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. The Annals of Mathematical Statistics (1952), 493--507.
[10]
Chisel 2020. Chisel. https://github.com/aspire-project/chisel (accessed on September 2020).
[11]
ChiselBench 2020. ChiselBench. https://github.com/aspire-project/chisel-bench (accessed on September 2020).
[12]
Arpit Christi, Alex Groce, and Rahul Gopinath. 2017. Resource adaptation via test-based software minimization. In 11th International Conference on Self-Adaptive and Self-Organizing Systems (SASO). 61--70.
[13]
Clang 2020. Clang: a C language family frontend for LLVM. https://clang.llvm.org/ (accessed on September 2020).
[14]
Debop 2020. Debop: Program Debloating via Stochastic Optimization. https://sites.google.com/view/debop19 (accessed on September 2020).
[15]
Kostas Ferles, Valentin Wüstholz, Maria Christakis, and Isil Dillig. 2017. Failure-directed program trimming. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE). 174--185.
[16]
Antonio Filieri, Corina S Păsăreanu, Willem Visser, and Jaco Geldenhuys. 2014. Statistical symbolic execution with informed sampling. In Proc. of the ACM SIGSOFT Intl. Symposium on Foundations of Software Engineering (FSE). 437--448.
[17]
Stuart Geman and Donald Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (1984), 721--741.
[18]
Walter R Gilks, Sylvia Richardson, and David Spiegelhalter. 1995. Markov chain Monte Carlo in practice. Chapman and Hall/CRC.
[19]
Carla P Gomes, Ashish Sabharwal, and Bart Selman. 2006. Model counting: A new strategy for obtaining good bounds. In Conference on Artificial Intelligence (AAAI). 54--61.
[20]
Christian Gram Kalhauge and Jens Palsberg. 2019. Binary reduction of dependency graphs. In Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 556--566.
[21]
Roman Haas, Rainer Niedermayr, Tobias Roehm, and Sven Apel. 2020. Is Static Analysis Able to Identify Unnecessary Source Code? ACM Transactions on Software Engineering and Methodology (TOSEM) (2020), 1--23.
[22]
Kihong Heo, Woosuk Lee, Pardis Pashakhanloo, and Mayur Naik. 2018. Effective Program Debloating via Reinforcement Learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS). 380--394.
[23]
Curt Hibbs, Steve Jewett, and Mike Sullivan. 2009. The art of lean software development: a practical and incremental approach. "O'Reilly Media, Inc.".
[24]
Jianjun Huang, Yousra Aafer, David Perry, Xiangyu Zhang, and Chen Tian. 2017. UI driven Android application reduction. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 286--296.
[25]
Sumit K Jha, Edmund M Clarke, Christopher J Langmead, Axel Legay, André Platzer, and Paolo Zuliani. 2009. A bayesian approach to model checking biological systems. In International conference on computational methods in systems biology (CMSB). 218--234.
[26]
Yufei Jiang, Qinkun Bao, Shuai Wang, Xiao Liu, and Dinghao Wu. 2018. RedDroid: Android application redundancy customization based on static analysis. In Proceedings of the 29th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 189--199.
[27]
Yufei Jiang, Dinghao Wu, and Peng Liu. 2016. JRed: Program Customization and Bloatware Mitigation Based on Static Analysis. In Proceedings of the 40th Annual Computer Software and Applications Conference (COMPSAC). IEEE, 12--21.
[28]
Neil D Jones, Carsten K Gomard, and Peter Sestoft. 1993. Partial evaluation and automatic program generation. Peter Sestoft.
[29]
Hyungjoon Koo, Seyedhamed Ghavamnia, and Michalis Polychronakis. 2019. Configuration-Driven Software Debloating. In Proceedings of the 12th European Workshop on Systems Security (EuroSec). ACM.
[30]
Pedro Larrañaga and Jose A Lozano. 2001. Estimation of distribution algorithms: A new tool for evolutionary computation. Kluwer Academic Publishers.
[31]
James R Larus. 2009. Spending Moore's dividend. Commun. ACM (2009), 62--69.
[32]
LattE 2020. LattE. https://www.math.ucdavis.edu/~latte/software.php
[33]
Vu Le, Chengnian Sun, and Zhendong Su. 2015. Finding deep compiler bugs via guided stochastic program mutation. In Proceedings of the 2015 ACM SIGPLAN international conference on Object oriented programming systems languages and applications (OOPSLA). 386--399.
[34]
Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. 2011. GenProg: A generic method for automatic software repair. IEEE Transactions on Software Engineering (TSE) (2011), 54--72.
[35]
Axel Legay, Benoît Delahaye, and Saddek Bensalem. 2010. Statistical model checking: An overview. In International conference on runtime verification. Springer, 122--135.
[36]
Han Liu, Chengnian Sun, Zhendong Su, Yu Jiang, Ming Gu, and Jiaguang Sun. 2017. Stochastic optimization of program obfuscation. In 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). 221--231.
[37]
llvm-cov 2020. llvm-cov. https://llvm.org/docs/CommandGuide/llvm-cov.html
[38]
Gregory Malecha, Ashish Gehani, and Natarajan Shankar. 2015. Automated software winnowing. In Proceedings of the 30th Annual ACM Symposium on Applied Computing (SAC). ACM, 1504--1511.
[39]
Aditya V Nori, Sherjil Ozair, Sriram K Rajamani, and Deepak Vijaykeerthy. [n.d.]. Efficient synthesis of probabilistic programs. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI).
[40]
Justyna Petke, Saemundur O Haraldsson, Mark Harman, William B Langdon, David R White, and John R Woodward. 2017. Genetic improvement of software: a comprehensive survey. IEEE Transactions on Evolutionary Computation (TEVC) (2017), 415--432.
[41]
Chenxiong Qian, Hong Hu, Mansour Alharthi, Pak Ho Chung, Taesoo Kim, and Wenke Lee. 2019. RAZOR: A Framework for Post-deployment Software Debloating. In Proceedings of the 28th USENIX Conference on Security Symposium (USENIX Security). 1733--1750.
[42]
Anh Quach, Aravind Prakash, and Lok Yan. 2018. Debloating software through piece-wise compilation and loading. In Proceedings of the 27th {USENIX} Security Symposium ({USENIX} Security). 869--886.
[43]
Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015).
[44]
Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel. 2017. Cimplifier: automatically debloating containers. In Proceedings of the 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE). ACM, 476--486.
[45]
John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-case reduction for C compiler bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 335--346.
[46]
Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. 2014. Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082 (2014).
[47]
Christian Robert and George Casella. 2013. Monte Carlo statistical methods. Springer Science & Business Media.
[48]
ROPgadget 2020. ROPgadget. https://github.com/JonathanSalwan/ROPgadget
[49]
Sampler 2020. Detailed Description of the Sampler Programs. https://drive.google.com/open?id=1D6-RurlAOu7RMpBxXEdGNV9pCtbu8Xuh
[50]
Adrian Sampson, Pavel Panchekha, Todd Mytkowicz, Kathryn S McKinley, Dan Grossman, and Luis Ceze. 2014. Expressing and verifying probabilistic assertions. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 112--122.
[51]
Sriram Sankaranarayanan, Aleksandar Chakarov, and Sumit Gulwani. 2013. Static analysis for probabilistic programs: inferring whole program properties from finitely many paths. In Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation (PLDI). 447--458.
[52]
Eric Schkufza, Rahul Sharma, and Alex Aiken. 2013. Stochastic superoptimization. In Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 305--316.
[53]
Eric Schulte, Jonathan Dorn, Stephen Harding, Stephanie Forrest, and Westley Weimer. 2014. Post-compiler software optimization for reducing energy. In Proceedings of the 19th international conference on Architectural support for programming languages and operating systems (ASPLOS). 639--652.
[54]
Hovav Shacham et al. 2007. The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86). In ACM conference on Computer and communications security. ACM, 552--561.
[55]
Hashim Sharif, Muhammad Abubakar, Ashish Gehani, and Fareed Zaffar. 2018. TRIMMER: application specialization for code debloating. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE). ACM, 329--339.
[56]
Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, and Zhendong Su. 2017. Guided, stochastic model-based GUI testing of Android apps. In Proceedings of the 25th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 245--256.
[57]
Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-guided program reduction. In Proceedings of the 40th International Conference on Software Engineering (ICSE). ACM, 361--371.
[58]
Abraham Wald. 1945. Sequential tests of statistical hypotheses. The annals of mathematical statistics (1945), 117--186.
[59]
Edward J Wegman. 1972. Nonparametric probability density estimation: I. A summary of available methods. Technometrics (1972), 533--546.
[60]
Mark Weiser. 1984. Program slicing. IEEE Transactions on software engineering (TSE) (1984), 352--357.
[61]
Qi Xin, Myeongsoo Kim, Qirun Zhang, and Alessandro Orso. 2020. Program debloating via stochastic optimization. In Proceedings of the 42st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER). 65--68.
[62]
Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, and Gary Sevitsky. 2009. Go with the flow: profiling copies to find runtime bloat. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 419--430.
[63]
Guoqing Xu, Nick Mitchell, Matthew Arnold, Atanas Rountev, Edith Schonberg, and Gary Sevitsky. 2014. Scalable runtime bloat detection using abstract dynamic slicing. ACM Transactions on Software Engineering and Methodology (TOSEM) (2014).
[64]
Guoqing Xu, Nick Mitchell, Matthew Arnold, Atanas Rountev, and Gary Sevitsky. 2010. Software bloat analysis: finding, removing, and preventing performance problems in modern large-scale object-oriented applications. In Proceedings of the FSE/SDP workshop on Future of software engineering research. ACM, 421--426.
[65]
Håkan LS Younes. 2006. Error control for probabilistic model checking. In International Workshop on Verification, Model Checking, and Abstract Interpretation. Springer, 142--156.
[66]
Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and isolating failure-inducing input. IEEE Transactions on Software Engineering (TSE) (2002), 183--200.

Cited By

View all
  • (2024)Software Debloating from Exception-Handler LensesProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695793(19-24)Online publication date: 14-Oct-2024
  • (2024)SoK: Software Debloating Landscape and Future DirectionsProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695792(11-18)Online publication date: 14-Oct-2024
  • (2024)Improving Program Debloating with 1-DU Chain MinimalityProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3643518(384-385)Online publication date: 14-Apr-2024
  • Show More Cited By

Index Terms

  1. Subdomain-based generality-aware debloating

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
    December 2020
    1449 pages
    ISBN:9781450367684
    DOI:10.1145/3324884
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    In-Cooperation

    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 January 2021

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. debloating
    2. generality-aware
    3. stochastic optimization

    Qualifiers

    • Research-article

    Funding Sources

    • Gift from Google
    • Gift from Microsoft Research
    • NSF grants
    • ONR contract

    Conference

    ASE '20
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 82 of 337 submissions, 24%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)29
    • Downloads (Last 6 weeks)4
    Reflects downloads up to 21 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Software Debloating from Exception-Handler LensesProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695793(19-24)Online publication date: 14-Oct-2024
    • (2024)SoK: Software Debloating Landscape and Future DirectionsProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695792(11-18)Online publication date: 14-Oct-2024
    • (2024)Improving Program Debloating with 1-DU Chain MinimalityProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3643518(384-385)Online publication date: 14-Apr-2024
    • (2024)Automatic build repair for test cases using incompatible java versionsInformation and Software Technology10.1016/j.infsof.2024.107473(107473)Online publication date: Apr-2024
    • (2024)SoK: A Tale of Reduction, Security, and Correctness - Evaluating Program Debloating Paradigms and Their CompositionsComputer Security – ESORICS 202310.1007/978-3-031-51482-1_12(229-249)Online publication date: 11-Jan-2024
    • (2023)Evaluating Container Debloaters2023 IEEE Secure Development Conference (SecDev)10.1109/SecDev56634.2023.00023(88-98)Online publication date: 18-Oct-2023
    • (2023)BLADE: Towards Scalable Source Code Debloating2023 IEEE Secure Development Conference (SecDev)10.1109/SecDev56634.2023.00022(75-87)Online publication date: 18-Oct-2023

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media