More Web Proxy on the site http://driver.im/

research-article

Subdomain-based generality-aware debloating

Authors:

Alessandro OrsoAuthors Info & Claims

ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering

Pages 224 - 236

https://doi.org/10.1145/3324884.3416644

Published: 27 January 2021 Publication History

Abstract

Programs are becoming increasingly complex and typically contain an abundance of unneeded features, which can degrade the performance and security of the software. Recently, we have witnessed a surge of debloating techniques that aim to create a reduced version of a program by eliminating the unneeded features therein. To debloat a program, most existing techniques require a usage profile of the program, typically provided as a set of inputs I. Unfortunately, these techniques tend to generate a reduced program that is over-fitted to I and thus fails to behave correctly for other inputs. To address this limitation, we propose DomGad, which has two main advantages over existing debloating approaches. First, it produces a reduced program that is guaranteed to work for subdomains, rather than for specific inputs. Second, it uses stochastic optimization to generate reduced programs that achieve a close-to-optimal tradeoff between reduction and generality (i.e., the extent to which the reduced program is able to correctly handle inputs in its whole domain). To assess the effectiveness of DomGad, we applied our approach to a benchmark of ten Unix utility programs. Our results are promising, as they show that DomGad could produce debloated programs that achieve, on average, 50% code reduction and 95% generality. Our results also show that DomGad performs well when compared with two state-of-the-art debloating approaches.

References

[1]

Ioannis Agadakos, Di Jin, David Williams-King, Vasileios P Kemerlis, and Georgios Portokalidis. 2019. Nibbler: debloating binary shared libraries. In Proceedings of the 35th Annual Computer Security Applications Conference (ACSAC). 70--83.

Digital Library

[2]

Abdulbaki Aydin, William Eiers, Lucas Bang, Tegan Brennan, Miroslav Gavrilov, Tevfik Bultan, and Fang Yu. 2018. Parameterized model counting for string and numeric constraints. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 400--410.

Digital Library

[3]

Babak Amin Azad, Pierre Laperdrix, and Nick Nikiforakis. 2019. Less is more: quantifying the security benefits of debloating web applications. In 28th USENIX Security Symposium (USENIX Security 19). 1697--1714.

[4]

Suparna Bhattacharya, Kanchi Gopinath, and Mangala Gowri Nanda. 2013. Combining concern input with program analysis for bloat detection. In Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages and applications (OOPSLA). ACM, 745--764.

Digital Library

[5]

Mateus Borges, Antonio Filieri, Marcelo d'Amorim, Corina S Păsăreanu, and Willem Visser. 2014. Compositional solution space quantification for probabilistic software analysis. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 123--132.

Digital Library

[6]

Erik Buchanan, Ryan Roemer, Hovav Shacham, and Stefan Savage. 2008. When good instructions go bad: Generalizing return-oriented programming to RISC. In Proceedings of the 15th ACM conference on Computer and communications security (CCS). ACM, 27--38.

Digital Library

[7]

Supratik Chakraborty, Kuldeep S Meel, Rakesh Mistry, and Moshe Y Vardi. 2016. Approximate probabilistic inference via word-level counting. In Thirtieth AAAI Conference on Artificial Intelligence (AAAI).

Digital Library

[8]

Yuting Chen, Ting Su, and Zhendong Su. 2019. Deep differential testing of JVM implementations. In Proceedings of the 41th International Conference on Software Engineering (ICSE). 1257--1268.

Digital Library

[9]

Herman Chernoff. 1952. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. The Annals of Mathematical Statistics (1952), 493--507.

[10]

Chisel 2020. Chisel. https://github.com/aspire-project/chisel (accessed on September 2020).

[11]

ChiselBench 2020. ChiselBench. https://github.com/aspire-project/chisel-bench (accessed on September 2020).

[12]

Arpit Christi, Alex Groce, and Rahul Gopinath. 2017. Resource adaptation via test-based software minimization. In 11th International Conference on Self-Adaptive and Self-Organizing Systems (SASO). 61--70.

[13]

Clang 2020. Clang: a C language family frontend for LLVM. https://clang.llvm.org/ (accessed on September 2020).

[14]

Debop 2020. Debop: Program Debloating via Stochastic Optimization. https://sites.google.com/view/debop19 (accessed on September 2020).

[15]

Kostas Ferles, Valentin Wüstholz, Maria Christakis, and Isil Dillig. 2017. Failure-directed program trimming. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE). 174--185.

Digital Library

[16]

Antonio Filieri, Corina S Păsăreanu, Willem Visser, and Jaco Geldenhuys. 2014. Statistical symbolic execution with informed sampling. In Proc. of the ACM SIGSOFT Intl. Symposium on Foundations of Software Engineering (FSE). 437--448.

Digital Library

[17]

Stuart Geman and Donald Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (1984), 721--741.

[18]

Walter R Gilks, Sylvia Richardson, and David Spiegelhalter. 1995. Markov chain Monte Carlo in practice. Chapman and Hall/CRC.

[19]

Carla P Gomes, Ashish Sabharwal, and Bart Selman. 2006. Model counting: A new strategy for obtaining good bounds. In Conference on Artificial Intelligence (AAAI). 54--61.

[20]

Christian Gram Kalhauge and Jens Palsberg. 2019. Binary reduction of dependency graphs. In Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 556--566.

Digital Library

[21]

Roman Haas, Rainer Niedermayr, Tobias Roehm, and Sven Apel. 2020. Is Static Analysis Able to Identify Unnecessary Source Code? ACM Transactions on Software Engineering and Methodology (TOSEM) (2020), 1--23.

[22]

Kihong Heo, Woosuk Lee, Pardis Pashakhanloo, and Mayur Naik. 2018. Effective Program Debloating via Reinforcement Learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS). 380--394.

Digital Library

[23]

Curt Hibbs, Steve Jewett, and Mike Sullivan. 2009. The art of lean software development: a practical and incremental approach. "O'Reilly Media, Inc.".

[24]

Jianjun Huang, Yousra Aafer, David Perry, Xiangyu Zhang, and Chen Tian. 2017. UI driven Android application reduction. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 286--296.

Digital Library

[25]

Sumit K Jha, Edmund M Clarke, Christopher J Langmead, Axel Legay, André Platzer, and Paolo Zuliani. 2009. A bayesian approach to model checking biological systems. In International conference on computational methods in systems biology (CMSB). 218--234.

Digital Library

[26]

Yufei Jiang, Qinkun Bao, Shuai Wang, Xiao Liu, and Dinghao Wu. 2018. RedDroid: Android application redundancy customization based on static analysis. In Proceedings of the 29th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 189--199.

[27]

Yufei Jiang, Dinghao Wu, and Peng Liu. 2016. JRed: Program Customization and Bloatware Mitigation Based on Static Analysis. In Proceedings of the 40th Annual Computer Software and Applications Conference (COMPSAC). IEEE, 12--21.

[28]

Neil D Jones, Carsten K Gomard, and Peter Sestoft. 1993. Partial evaluation and automatic program generation. Peter Sestoft.

[29]

Hyungjoon Koo, Seyedhamed Ghavamnia, and Michalis Polychronakis. 2019. Configuration-Driven Software Debloating. In Proceedings of the 12th European Workshop on Systems Security (EuroSec). ACM.

Digital Library

[30]

Pedro Larrañaga and Jose A Lozano. 2001. Estimation of distribution algorithms: A new tool for evolutionary computation. Kluwer Academic Publishers.

[31]

James R Larus. 2009. Spending Moore's dividend. Commun. ACM (2009), 62--69.

[32]

LattE 2020. LattE. https://www.math.ucdavis.edu/~latte/software.php

[33]

Vu Le, Chengnian Sun, and Zhendong Su. 2015. Finding deep compiler bugs via guided stochastic program mutation. In Proceedings of the 2015 ACM SIGPLAN international conference on Object oriented programming systems languages and applications (OOPSLA). 386--399.

Digital Library

[34]

Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. 2011. GenProg: A generic method for automatic software repair. IEEE Transactions on Software Engineering (TSE) (2011), 54--72.

[35]

Axel Legay, Benoît Delahaye, and Saddek Bensalem. 2010. Statistical model checking: An overview. In International conference on runtime verification. Springer, 122--135.

Digital Library

[36]

Han Liu, Chengnian Sun, Zhendong Su, Yu Jiang, Ming Gu, and Jiaguang Sun. 2017. Stochastic optimization of program obfuscation. In 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). 221--231.

Digital Library

[37]

llvm-cov 2020. llvm-cov. https://llvm.org/docs/CommandGuide/llvm-cov.html

[38]

Gregory Malecha, Ashish Gehani, and Natarajan Shankar. 2015. Automated software winnowing. In Proceedings of the 30th Annual ACM Symposium on Applied Computing (SAC). ACM, 1504--1511.

Digital Library

[39]

Aditya V Nori, Sherjil Ozair, Sriram K Rajamani, and Deepak Vijaykeerthy. [n.d.]. Efficient synthesis of probabilistic programs. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI).

[40]

Justyna Petke, Saemundur O Haraldsson, Mark Harman, William B Langdon, David R White, and John R Woodward. 2017. Genetic improvement of software: a comprehensive survey. IEEE Transactions on Evolutionary Computation (TEVC) (2017), 415--432.

[41]

Chenxiong Qian, Hong Hu, Mansour Alharthi, Pak Ho Chung, Taesoo Kim, and Wenke Lee. 2019. RAZOR: A Framework for Post-deployment Software Debloating. In Proceedings of the 28th USENIX Conference on Security Symposium (USENIX Security). 1733--1750.

[42]

Anh Quach, Aravind Prakash, and Lok Yan. 2018. Debloating software through piece-wise compilation and loading. In Proceedings of the 27th {USENIX} Security Symposium ({USENIX} Security). 869--886.

[43]

Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015).

[44]

Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel. 2017. Cimplifier: automatically debloating containers. In Proceedings of the 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE). ACM, 476--486.

Digital Library

[45]

John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-case reduction for C compiler bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 335--346.

Digital Library

[46]

Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. 2014. Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082 (2014).

[47]

Christian Robert and George Casella. 2013. Monte Carlo statistical methods. Springer Science & Business Media.

[48]

ROPgadget 2020. ROPgadget. https://github.com/JonathanSalwan/ROPgadget

[49]

Sampler 2020. Detailed Description of the Sampler Programs. https://drive.google.com/open?id=1D6-RurlAOu7RMpBxXEdGNV9pCtbu8Xuh

[50]

Adrian Sampson, Pavel Panchekha, Todd Mytkowicz, Kathryn S McKinley, Dan Grossman, and Luis Ceze. 2014. Expressing and verifying probabilistic assertions. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 112--122.

Digital Library

[51]

Sriram Sankaranarayanan, Aleksandar Chakarov, and Sumit Gulwani. 2013. Static analysis for probabilistic programs: inferring whole program properties from finitely many paths. In Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation (PLDI). 447--458.

Digital Library

[52]

Eric Schkufza, Rahul Sharma, and Alex Aiken. 2013. Stochastic superoptimization. In Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 305--316.

Digital Library

[53]

Eric Schulte, Jonathan Dorn, Stephen Harding, Stephanie Forrest, and Westley Weimer. 2014. Post-compiler software optimization for reducing energy. In Proceedings of the 19th international conference on Architectural support for programming languages and operating systems (ASPLOS). 639--652.

Digital Library

[54]

Hovav Shacham et al. 2007. The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86). In ACM conference on Computer and communications security. ACM, 552--561.

[55]

Hashim Sharif, Muhammad Abubakar, Ashish Gehani, and Fareed Zaffar. 2018. TRIMMER: application specialization for code debloating. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE). ACM, 329--339.

Digital Library

[56]

Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, and Zhendong Su. 2017. Guided, stochastic model-based GUI testing of Android apps. In Proceedings of the 25th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 245--256.

Digital Library

[57]

Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-guided program reduction. In Proceedings of the 40th International Conference on Software Engineering (ICSE). ACM, 361--371.

Digital Library

[58]

Abraham Wald. 1945. Sequential tests of statistical hypotheses. The annals of mathematical statistics (1945), 117--186.

[59]

Edward J Wegman. 1972. Nonparametric probability density estimation: I. A summary of available methods. Technometrics (1972), 533--546.

[60]

Mark Weiser. 1984. Program slicing. IEEE Transactions on software engineering (TSE) (1984), 352--357.

Digital Library

[61]

Qi Xin, Myeongsoo Kim, Qirun Zhang, and Alessandro Orso. 2020. Program debloating via stochastic optimization. In Proceedings of the 42st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER). 65--68.

Digital Library

[62]

Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, and Gary Sevitsky. 2009. Go with the flow: profiling copies to find runtime bloat. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 419--430.

Digital Library

[63]

Guoqing Xu, Nick Mitchell, Matthew Arnold, Atanas Rountev, Edith Schonberg, and Gary Sevitsky. 2014. Scalable runtime bloat detection using abstract dynamic slicing. ACM Transactions on Software Engineering and Methodology (TOSEM) (2014).

[64]

Guoqing Xu, Nick Mitchell, Matthew Arnold, Atanas Rountev, and Gary Sevitsky. 2010. Software bloat analysis: finding, removing, and preventing performance problems in modern large-scale object-oriented applications. In Proceedings of the FSE/SDP workshop on Future of software engineering research. ACM, 421--426.

Digital Library

[65]

Håkan LS Younes. 2006. Error control for probabilistic model checking. In International Workshop on Verification, Model Checking, and Abstract Interpretation. Springer, 142--156.

[66]

Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and isolating failure-inducing input. IEEE Transactions on Software Engineering (TSE) (2002), 183--200.

Digital Library

Cited By

Alhanahnah MJhumka ACraven RMickelson M(2024)Software Debloating from Exception-Handler LensesProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695793(19-24)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3689937.3695793
Alhanahnah MBoshmaf YGehani ACraven RMickelson M(2024)SoK: Software Debloating Landscape and Future DirectionsProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695792(11-18)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3689937.3695792
Kim MPande SOrso ARoychoudhury APaiva AAbreu RStorey M(2024)Improving Program Debloating with 1-DU Chain MinimalityProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3643518(384-385)Online publication date: 14-Apr-2024
https://dl.acm.org/doi/10.1145/3639478.3643518
Show More Cited By

Index Terms

Subdomain-based generality-aware debloating
1. Software and its engineering
  1. Software notations and tools
    1. Software maintenance tools

Recommendations

Studying and Understanding the Tradeoffs Between Generality and Reduction in Software Debloating
ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

Existing approaches for program debloating often use a usage profile, typically provided as a set of inputs, for identifying the features of a program to be preserved. Specifically, given a program and a set of inputs, these techniques produce a reduced ...
Slimium: Debloating the Chromium Browser with Feature Subsetting
CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security

Today, a web browser plays a crucial role in offering a broad spectrum of web experiences. The most popular browser, Chromium, has become an extremely complex application to meet ever-increasing user demands, exposing unavoidably large attack vectors ...
Effective Program Debloating via Reinforcement Learning
CCS '18: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security

Prevalent software engineering practices such as code reuse and the "one-size-fits-all" methodology have contributed to significant and widespread increases in the size and complexity of software. The resulting software bloat has led to decreased ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering

December 2020

1449 pages

ISBN:9781450367684

DOI:10.1145/3324884

General Chair:
John Grundy,
Program Chairs:
Claire Le Goues,
David Lo

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 January 2021

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

Gift from Google
Gift from Microsoft Research
NSF grants
ONR contract

Conference

ASE '20

Sponsor:

ASE '20: 35th IEEE/ACM International Conference on Automated Software Engineering

December 21 - 25, 2020

Virtual Event, Australia

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

7
Total Citations
View Citations
205
Total Downloads

Downloads (Last 12 months)29
Downloads (Last 6 weeks)4

Reflects downloads up to 21 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Alhanahnah MJhumka ACraven RMickelson M(2024)Software Debloating from Exception-Handler LensesProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695793(19-24)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3689937.3695793
Alhanahnah MBoshmaf YGehani ACraven RMickelson M(2024)SoK: Software Debloating Landscape and Future DirectionsProceedings of the 2024 Workshop on Forming an Ecosystem Around Software Transformation10.1145/3689937.3695792(11-18)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3689937.3695792
Kim MPande SOrso ARoychoudhury APaiva AAbreu RStorey M(2024)Improving Program Debloating with 1-DU Chain MinimalityProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3643518(384-385)Online publication date: 14-Apr-2024
https://dl.acm.org/doi/10.1145/3639478.3643518
Mak CCheung S(2024)Automatic build repair for test cases using incompatible java versionsInformation and Software Technology10.1016/j.infsof.2024.107473(107473)Online publication date: Apr-2024
https://doi.org/10.1016/j.infsof.2024.107473
Ali MMuzammil MKarim FNaeem AHaroon RHaris MNadeem HSabir WShaon FZaffar FYegneswaran VGehani ARahaman S(2024)SoK: A Tale of Reduction, Security, and Correctness - Evaluating Program Debloating Paradigms and Their CompositionsComputer Security – ESORICS 202310.1007/978-3-031-51482-1_12(229-249)Online publication date: 11-Jan-2024
https://doi.org/10.1007/978-3-031-51482-1_12
Hassan MTahir TFarrukh MNaveed ANaeem AZaffar FShaon FGehani ARahaman S(2023)Evaluating Container Debloaters2023 IEEE Secure Development Conference (SecDev)10.1109/SecDev56634.2023.00023(88-98)Online publication date: 18-Oct-2023
https://doi.org/10.1109/SecDev56634.2023.00023
Ali MHabib RGehani ARahaman SUzmi Z(2023)BLADE: Towards Scalable Source Code Debloating2023 IEEE Secure Development Conference (SecDev)10.1109/SecDev56634.2023.00022(75-87)Online publication date: 18-Oct-2023
https://doi.org/10.1109/SecDev56634.2023.00022

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents