[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learning

Published: 12 July 2024 Publication History

Abstract

Program slicing, the process of extracting program statements that influence values at a designated location (known as the slicing criterion), is helpful in both manual and automated debugging. However, such slicing techniques prove ineffective in scenarios where executing specific inputs is prohibitively expensive, or even impossible, as with partial code. In this paper, we introduce ND-Slicer, a predictive slicing methodology that caters to specific executions based on a particular input, overcoming the need for actual execution. We enable such a process by leveraging execution-aware pre-training to learn the dynamic program dependencies, including both dynamic data and control dependencies between variables in the slicing criterion and the remaining program statements. Such knowledge forms the cornerstone for constructing a predictive backward slice. Our empirical evaluation revealed a high accuracy in predicting program slices, achieving an exact-match accuracy of 81.3% and a ROUGE-LCS F1-score of 95.4% on Python programs. As an extrinsic evaluation, we illustrate ND-Slicer’s usefulness in crash detection, with it locating faults with an accuracy of 63.9%. Furthermore, we include an in-depth qualitative evaluation, assessing ND-Slicer’s understanding of branched structures such as if-else blocks and loops, as well as the control flow in inter-procedural calls.

References

[1]
[n. d.]. https://github.com/AlaFalaki/AttentionVisualizer.
[2]
2022. A static analysis library for computing graph representations of Python programs. https://github.com/google-research/python-graphs
[3]
2023. Replication package for ND-Slicer. https://github.com/aashishyadavally/nd-slicer
[4]
Hiralal Agrawal and Joseph R. Horgan. 1990. Dynamic program slicing. In Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation (PLDI ’90). Association for Computing Machinery, New York, NY, USA. 246–256. isbn:0897913647 https://doi.org/10.1145/93542.93576
[5]
Esben Andreasen, Liang Gong, Anders Møller, Michael Pradel, Marija Selakovic, Koushik Sen, and Cristian-Alexandru Staicu. 2017. A Survey of Dynamic Analysis and Test Generation for JavaScript. ACM Comput. Surv., 50, 5 (2017), Article 66, sep, 36 pages. issn:0360-0300 https://doi.org/10.1145/3106739
[6]
David Binkley and Keith Gallagher. 1996. Program Slicing. Journal of Advanced Computing, 43 (1996), 1–50.
[7]
David Binkley, Nicolas Gold, Mark Harman, Syed Islam, Jens Krinke, and Shin Yoo. 2014. ORBS: Language-independent Program Slicing. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM, 109–120.
[8]
David Binkley and Mark Harman. 2004. A survey of empirical results on program slicing. Journal of Advanced Computing, 62 (2004), 105–178.
[9]
Derek Bruening, Timothy Garnett, and Saman Amarasinghe. 2003. An infrastructure for adaptive dynamic optimization. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization (CGO ’03). IEEE Computer Society, USA. 265–275. isbn:076951913X
[10]
G. Canfora, A. Cimitile, and A. De Lucia. 1998. Conditioned Program Slicing. Inf. Soft. Technology, 40, 11-12 (1998), 595–608.
[11]
Andrea de Lucia, Anna Rita Fasolino, and Malcolm Munro. 1996. Understanding Function Behaviors Through Program Slicing. In Proceedings of the 4th International Workshop on Program Comprehension. IEEE Computer Society, 9–18.
[12]
Yangruibo Ding, Benjamin Steenhoek, Kexin Pei, Gail E. Kaiser, Wei Le, and Baishakhi Ray. 2024. TRACED: Execution-aware Pre-training for Source Code. In Proceedings of the International Conference on Software Engineering (ICSE’24). ACM Press.
[13]
John Field, G. Ramalingam, and Frank Tip. 1995. Parametric Program Slicing. In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, 379–392.
[14]
Keith Gallagher, David Binkley, and Mark Harman. 2006. Stop-List Slicing. In Proceedings of the Sixth IEEE International Workshop on Source Code Analysis and Manipulation. IEEE Computer Society, 11–20.
[15]
Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin B. Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, and Ming Zhou. 2021. GraphCodeBERT: Pre-training Code Representations with Data Flow. In ICLR. OpenReview.net.
[16]
Mark Harman and Sebastian Danicic. 1997. Amorphous Program Slicing. In Proceedings of the 5th International Workshop on Program Comprehension. IEEE Computer Society, 70–79.
[17]
Mark Harman, Sebastian Danicic, Yoga Sivagurunathan, and Dan Simpson. 1996. The Next 700 Slicing Criteria. In Proceedings of the 2nd U.K. Workshop on Program Comprehension.
[18]
Mark Harman and Keith Gallagher. 1998. Program Slicing. Inform. Softw. Technol., 40 (1998), 577–582.
[19]
Mark Harman and Robert Hierons. 2001. An overview of program slicing. Softw. Focus, 3 (2001), 85–92.
[20]
Mark Harman, Robert Hierons, Chris Fox, Sebastian Danicic, and John Howroyd. 2001. Pre/Post Conditioned Slicing. In Proceedings of the IEEE International Conference on Software Maintenance. IEEE Computer Society, 138–147.
[21]
John Hatcliff, Matthew B. Dwyer, and Hongjun Zheng. 2000. Slicing Software for Model Construction. Higher Order Symbolic Computation, 13, 4 (2000), 315–353.
[22]
Ranjit Jhala and Rupak Majumdar. 2005. Path Slicing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 38–47.
[23]
Bogdan Korel and Janusz Laski. 1988. Dynamic Program Slicing. Inf. Process. Lett., 29, 3 (1988), Oct., 155–163.
[24]
Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, and Nan Duan. 2023. Code Execution with Pre-trained Language Models. In ACL (Findings). Association for Computational Linguistics, 4984–4999.
[25]
Tongping Liu, Charlie Curtsinger, and Emery D. Berger. 2016. DoubleTake: fast and precise error detection via evidence-based dynamic analysis. In Proceedings of the 38th International Conference on Software Engineering (ICSE ’16). Association for Computing Machinery, New York, NY, USA. 911–922. isbn:9781450339001 https://doi.org/10.1145/2884781.2884784
[26]
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. CoRR, abs/1907.11692 (2019).
[27]
Andrea De Lucia. 2001. Program slicing: Methods and applications. In Proceedings of the 1st IEEE International Workshop on Source Code Analysis and Manipulation. IEEE Computer Society, 142–149.
[28]
J. Maras, J. Carlson, and I. Crnkovic. 2011. Client-side web application slicing. In 26th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE Press, 504–507.
[29]
Akira Nishimatsu, Minoru Jihira, Shinji Kusumoto, and Katsuro Inoue. 1999. Call-mark Slicing: An Efficient and Economical Way of Reducing Slice. In Proceedings of the 21st International Conference on Software Engineering. ACM, 422–431.
[30]
Alex Orso, Saurabh Sinha, and Mary Jean Harrold. 2001. Incremental slicing based on data-dependence types. In Proceedings of the IEEE International Conference on Software Maintenance. IEEE Computer Society, 158–167.
[31]
Nikhil Prabhu, Sewoong Min, Haewoon Nam, Girma Tewolde, and Jaerock Kwon. 2020. Integrated Framework of Autonomous Vehicle with Traffic Sign Recognition in Simulation Environment. In 2020 IEEE International Conference on Electro Information Technology (EIT). 514–521. https://doi.org/10.1109/EIT48999.2020.9208241
[32]
Ruchir Puri, David S. Kung, Geert Janssen, Wei Zhang, Giacomo Domeniconi, Vladimir Zolotov, Julian Dolby, Jie Chen, Mihir R. Choudhury, Lindsey Decker, Veronika Thost, Luca Buratti, Saurabh Pujar, and Ulrich Finkler. 2021. Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks. CoRR, abs/2105.12655 (2021).
[33]
Gregor Richards, Sylvain Lebresne, Brian Burg, and Jan Vitek. 2010. An analysis of the dynamic behavior of JavaScript programs. In Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’10). Association for Computing Machinery, New York, NY, USA. 1–12. isbn:9781450300193 https://doi.org/10.1145/1806596.1806598
[34]
Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get To The Point: Summarization with Pointer-Generator Networks. In ACL (1). Association for Computational Linguistics, 1073–1083.
[35]
Josep Silva. 2012. A Vocabulary of Program Slicing-based Techniques. ACM Comput. Surv., 44, 3 (2012), Article 12, June, 41 pages.
[36]
Beatriz Souza and Michael Pradel. 2023. LExecutor: Learning-Guided Execution. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023). Association for Computing Machinery, New York, NY, USA. 1522–1534. isbn:9798400703270 https://doi.org/10.1145/3611643.3616254
[37]
Frank Tip. 1994. A Survey of Program Slicing Techniques. Amsterdam, The Netherlands.
[38]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Ł ukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). 30, Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
[39]
Ratnadira Widyasari, Sheng Qin Sim, Camellia Lok, Haodi Qi, Jack Phan, Qijin Tay, Constance Tan, Fiona Wee, Jodie Ethelda Tan, Yuheng Yieh, Brian Goh, Ferdian Thung, Hong Jin Kang, Thong Hoang, David Lo, and Eng Lieh Ouh. 2020. BugsInPy: a database of existing bugs in Python programs to enable controlled testing and debugging studies. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA. 1556–1560. isbn:9781450370431 https://doi.org/10.1145/3368089.3417943
[40]
Baowen Xu, Zhenqiang Chen, and Hongji Yang. 2002. Dynamic slicing object-oriented programs for debugging. In Proceedings. Second IEEE International Workshop on Source Code Analysis and Manipulation. 115–122. https://doi.org/10.1109/SCAM.2002.1134111
[41]
Baowen Xu, Ju Qian, Xiaofang Zhang, Zhongqiang Wu, and Lin Chen. 2005. A brief survey of program slicing. SIGSOFT Softw. Eng. Notes, 30, 2 (2005), mar, 1–36. issn:0163-5948 https://doi.org/10.1145/1050849.1050865
[42]
Aashish Yadavally, Tien N. Nguyen, Wenbo Wang, and Shaohua Wang. 2023. (Partial) Program Dependence Learning. In Proceedings of the 45th International Conference on Software Engineering (ICSE ’23). IEEE Press, 2501–2513. isbn:9781665457019 https://doi.org/10.1109/ICSE48619.2023.00209
[43]
Tianyi Zhang, Di Yang, Crista Lopes, and Miryung Kim. 2019. Analyzing and Supporting Adaptation of Online Code Examples. In Proceedings of the 41st International Conference on Software Engineering (ICSE ’19). IEEE Press, 316–327. https://doi.org/10.1109/ICSE.2019.00046

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Software Engineering
Proceedings of the ACM on Software Engineering  Volume 1, Issue FSE
July 2024
2770 pages
EISSN:2994-970X
DOI:10.1145/3554322
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 July 2024
Published in PACMSE Volume 1, Issue FSE

Badges

  • Distinguished Paper

Author Tags

  1. AI4SE
  2. Dynamic Dependence Learning
  3. Dynamic Slicing
  4. Execution-Guided Learning
  5. Large Language Models
  6. Neural Networks

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation
  • National Security Agency

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 264
    Total Downloads
  • Downloads (Last 12 months)264
  • Downloads (Last 6 weeks)69
Reflects downloads up to 13 Dec 2024

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media