[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3238147.3238211acmconferencesArticle/Chapter ViewAbstractPublication PagesaseConference Proceedingsconference-collections
research-article

Datalog-based scalable semantic diffing of concurrent programs

Published: 03 September 2018 Publication History

Abstract

When an evolving program is modified to address issues related to thread synchronization, there is a need to confirm the change is correct, i.e., it does not introduce unexpected behavior. However, manually comparing two programs to identify the semantic difference is labor intensive and error prone, whereas techniques based on model checking are computationally expensive. To fill the gap, we develop a fast and approximate static analysis for computing synchronization differences of two programs. The method is fast because, instead of relying on heavy-weight model checking techniques, it leverages a polynomial-time Datalog-based program analysis framework to compute differentiating data-flow edges, i.e., edges allowed by one program but not the other. Although approximation is used our method is sufficiently accurate due to careful design of the Datalog inference rules and iterative increase of the required data-flow edges for representing a difference. We have implemented our method and evaluated it on a large number of multithreaded C programs to confirm its ability to produce, often within seconds, the same differences obtained by human; in contrast, prior techniques based on model checking take minutes or even hours and thus can be 10x to 1000x slower.

References

[1]
Gcc bug 21334. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334.
[2]
Gcc bug 24430. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25330.
[3]
Gcc bug 3584. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3584.
[4]
Gcc bug 40518. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40518.
[5]
Glib bug 51264.
[6]
https://bugzilla.gnome.org/show_bug.cgi?id=512624.
[7]
Jetty bug 1187.
[8]
https://jira.codejaus.org/browse/JET TY- 1187.
[9]
Llvm bug 8441. http://llvm.org/bugs/show_bug.cgi?id=8441.
[10]
Martín Abadi and Leslie Lamport. The existence of refinement mappings. Theoretical Computer Science, 82(2):253–284, May 1991.
[11]
Vikram Adve, Chris Lattner, Michael Brukman, Anand Shukla, and Brian Gaeke. LLVA: A Low-level Virtual Instruction Set Architecture. In ACM/IEEE international symposium on Microarchitecture, Dec 2003.
[12]
Aws Albarghouthi, Paraschos Koutris, Mayur Naik, and Calvin Smith. Constraintbased synthesis of datalog programs. In International Conference on Principles and Practice of Constraint Programming, pages 689–706, 2017.
[13]
Dirk Beyer. Software verification and verifiable witnesses. In International Conference on Tools and Algorithms for Construction and Analysis of Systems, pages 401–416, 2015.
[14]
Sandeep Bindal, Sorav Bansal, and Akash Lal. Variable and thread bounding for systematic testing of multithreaded programs. In International Symposium on Software Testing and Analysis, pages 145–155, 2013.
[15]
Roderick Bloem, Georg Hofferek, Bettina Könighofer, Robert Könighofer, Simon Außerlechner, and Raphael Spörk. Synthesis of synchronization using uninterpreted functions. In International Conference on Formal Methods in Computer-Aided Design, pages 11:35–11:42, 2014.
[16]
Ahmed Bouajjani, Constantin Enea, and Shuvendu K. Lahiri. Abstract Semantic Diffing of Evolving Concurrent Programs, pages 46–65. Springer International Publishing, Cham, 2017.
[17]
Martin Bravenboer and Yannis Smaragdakis. Strictly declarative specification of sophisticated points-to analyses. In ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages, and Applications, pages 243–262, 2009.
[18]
Steven Dawson, C. R. Ramakrishnan, and David S. Warren. Practical program analysis using general purpose logic programming systems—a case study. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 117–126, 1996.
[19]
Azadeh Farzan and Zachary Kincaid. Verification of parameterized concurrent programs by modular reasoning about data and control. In ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pages 297–308, 2012.
[20]
Benny Godlin and Ofer Strichman. Time for verification. chapter Inference Rules for Proving the Equivalence of Recursive Procedures, pages 167–184. Springer-Verlag, Berlin, Heidelberg, 2010.
[21]
Shengjian Guo, Markus Kusano, and Chao Wang. Conc-iSE: Incremental symbolic execution of concurrent software. In IEEE/ACM International Conference On Automated Software Engineering, pages 531–542, 2016.
[22]
Shengjian Guo, Markus Kusano, Chao Wang, Zijiang Yang, and Aarti Gupta. Assertion guided symbolic execution of multithreaded programs. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 854–865, 2015.
[23]
Elnar Hajiyev, Mathieu Verbaere, and Oege de Moor. CodeQuest: Scalable source code queries with datalog. In European Conference on Object-Oriented Programming, pages 2–27, 2006.
[24]
Nevin Heintze and Olivier Tardieu. Demand-driven pointer analysis. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 24–34, 2001.
[25]
Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2008.
[26]
Krystof Hoder, Nikolaj Bjørner, and Leonardo de Moura. muZ - an efficient engine for fixed points with constraints. In International Conference on Computer Aided Verification, pages 457–462, 2011.
[27]
Susan Horwitz, Thomas Reps, and Mooly Sagiv. Demand interprocedural dataflow analysis. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 104–115, 1995.
[28]
Daniel Jackson and David A. Ladd. Semantic diff: A tool for summarizing the effects of modifications. In International Conference on Software Maintenance, pages 243–252, 1994.
[29]
Vilas Jagannath, Qingzhou Luo, and Darko Marinov. Change-aware preemption prioritization. In International Symposium on Software Testing and Analysis, pages 133–143, 2011.
[30]
Saurabh Joshi, Shuvendu K. Lahiri, and Akash Lal. Underspecified harnesses and interleaved bugs. In ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pages 19–30, 2012.
[31]
Sepideh Khoshnood, Markus Kusano, and Chao Wang. Concbugassist: Constraint solving for diagnosis and repair of concurrency bugs. In International Symposium on Software Testing and Analysis, pages 165–176, 2015.
[32]
Markus Kusano and Chao Wang. Assertion guided abstraction: A cooperative optimization for dynamic partial order reduction. In IEEE/ACM International Conference On Automated Software Engineering, pages 175–186, 2014.
[33]
Markus Kusano and Chao Wang. Flow-sensitive composition of thread-modular abstract interpretation. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 799–809, 2016.
[34]
Markus J. Kusano and Chao Wang. Thread-modular static analysis for relaxed memory models. In ACM SIGSOFT Symposium on Foundations of Software Engineering, 2017.
[35]
Shuvendu K. Lahiri, Chris Hawblitzel, Ming Kawaguchi, and Henrique Rebêlo. SYMDIFF: A language-agnostic semantic diff tool for imperative programs. In International Conference on Computer Aided Verification, pages 712–717, 2012.
[36]
Shuvendu K. Lahiri, Kenneth L. McMillan, Rahul Sharma, and Chris Hawblitzel. Differential assertion checking. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 345–355, 2013.
[37]
Monica S. Lam, John Whaley, V. Benjamin Livshits, Michael C. Martin, Dzintars Avots, Michael Carbin, and Christopher Unkel. Context-sensitive program analysis as database queries. In ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 1–12, 2005.
[38]
Steffen Lehnert. A taxonomy for software change impact analysis. In International Workshop on Principles of Software Evolution and Annual ERCIM Workshop on Software Evolution, pages 41–50, 2011.
[39]
V. Benjamin Livshits and Monica S. Lam. Finding security vulnerabilities in java applications with static analysis. In USENIX Security Symposium, pages 18–18, 2005.
[40]
Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. Learning from mistakes: A comprehensive study on real world concurrency bug characteristics. In International Conference on Architectural Support for Programming Languages and Operating Systems, pages 329–339, 2008.
[41]
Paul Dan Marinescu and Cristian Cadar. KATCH: High-coverage testing of software patches. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 235–245, 2013.
[42]
Madanlal Musuvathi, Shaz Qadeer, Thomas Ball, Gérard Basler, Piramanayagam Arumuga Nainar, and Iulian Neamtiu. Finding and reproducing heisenbugs in concurrent programs. In USENIX Symposium on Operating Systems Design and Implementation, pages 267–280, 2008.
[43]
Mayur Naik, Alex Aiken, and John Whaley. Effective static race detection for java. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 308–319, 2006.
[44]
Suzette Person, Matthew B. Dwyer, Sebastian Elbaum, and Corina S. P ˇ as ˇ areanu. Differential symbolic execution. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 226–237, 2008.
[45]
G. Ramalingam. Context-sensitive synchronization-sensitive analysis is undecidable. ACM Trans. Program. Lang. Syst., 22(2):416–430, 2000.
[46]
Dennis Shasha and Marc Snir. Efficient and correct execution of parallel programs that share memory. ACM Trans. Program. Lang. Syst., 10(2):282–312, 1988.
[47]
Chungha Sung, Markus Kusano, Nishant Sinha, and Chao Wang. Static DOM event dependency analysis for testing web applications. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 447–459, 2016.
[48]
Chungha Sung, Markus Kusano, and Chao Wang. Modular verification of interrupt-driven software. In IEEE/ACM International Conference On Automated Software Engineering, pages 206–216, 2017.
[49]
Chao Wang, Yu Yang, Aarti Gupta, and Ganesh Gopalakrishnan. Dynamic model checking with property driven pruning to detect race conditions. In International Symposium on Automated Technology for Verification and Analysis, pages 126–140, 2008.
[50]
John Whaley and Monica S. Lam. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 131–144, 2004.
[51]
Yu Yang, Xiaofang Chen, and Ganesh Gopalakrishnan. Inspect: A runtime model checker for multithreaded C programs. Technical report, University of Utah, 2008.
[52]
Yu Yang, Xiaofang Chen, Ganesh Gopalakrishnan, and Chao Wang. Automatic discovery of transition symmetry in multithreaded programs using dynamic analysis. In International SPIN Workshop on Model Checking Software, pages 279–295, 2009.
[53]
Zuoning Yin, Ding Yuan, Yuanyuan Zhou, Shankar Pasupathy, and Lakshmi Bairavasundaram. How do fixes become bugs? In ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, pages 26–36, 2011.
[54]
Jie Yu and Satish Narayanasamy. A case for an interleaving constrained sharedmemory multi-processor. In International Symposium on Computer Architecture, pages 325–336, 2009.
[55]
Tingting Yu, Zunchen Huang, and Chao Wang. ConTesa: Directed test suite augmentation for concurrent software. IEEE Transactions on Software Engineering, 2018.
[56]
Tingting Yu, Witawas Srisa-an, and Gregg Rothermel. SimRT: An automated framework to support regression testing for data races. In International Conference on Software Engineering, pages 48–59, 2014.
[57]
Tingting Yu, Tarannum S. Zaman, and Chao Wang. DESCRY: reproducing system-level concurrency failures. In ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 694–704, 2017.
[58]
Xin Zhang, Ravi Mangal, Radu Grigore, Mayur Naik, and Hongseok Yang. On abstraction refinement for program analyses in datalog. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 239–248, 2014.

Cited By

View all
  • (2023)DCLink: Bridging Data Constraint Changes and Implementations in FinTech SystemsProceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering10.1109/ASE56229.2023.00170(914-925)Online publication date: 11-Nov-2023
  • (2019)Specification and inference of trace refinement relationsProceedings of the ACM on Programming Languages10.1145/33606043:OOPSLA(1-30)Online publication date: 10-Oct-2019
  • (2019)Continuously reasoning about programs using differential Bayesian inferenceProceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3314221.3314616(561-575)Online publication date: 8-Jun-2019
  • Show More Cited By

Index Terms

  1. Datalog-based scalable semantic diffing of concurrent programs

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ASE '18: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering
    September 2018
    955 pages
    ISBN:9781450359375
    DOI:10.1145/3238147
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 September 2018

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Concurrency
    2. Datalog
    3. atomicity
    4. change impact
    5. race condition
    6. semantic diffing
    7. static analysis

    Qualifiers

    • Research-article

    Conference

    ASE '18
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 82 of 337 submissions, 24%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)12
    • Downloads (Last 6 weeks)3
    Reflects downloads up to 10 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2023)DCLink: Bridging Data Constraint Changes and Implementations in FinTech SystemsProceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering10.1109/ASE56229.2023.00170(914-925)Online publication date: 11-Nov-2023
    • (2019)Specification and inference of trace refinement relationsProceedings of the ACM on Programming Languages10.1145/33606043:OOPSLA(1-30)Online publication date: 10-Oct-2019
    • (2019)Continuously reasoning about programs using differential Bayesian inferenceProceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3314221.3314616(561-575)Online publication date: 8-Jun-2019
    • (2019)DebreachProceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering10.1109/ASE.2019.00088(899-911)Online publication date: 10-Nov-2019

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media