[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Kendo: efficient deterministic multithreading in software

Published: 07 March 2009 Publication History

Abstract

Although chip-multiprocessors have become the industry standard, developing parallel applications that target them remains a daunting task. Non-determinism, inherent in threaded applications, causes significant challenges for parallel programmers by hindering their ability to create parallel applications with repeatable results. As a consequence, parallel applications are significantly harder to debug, test, and maintain than sequential programs.
This paper introduces Kendo: a new software-only system that provides deterministic multithreading of parallel applications. Kendo enforces a deterministic interleaving of lock acquisitions and specially declared non-protected reads through a novel dynamically load-balanced deterministic scheduling algorithm. The algorithm tracks the progress of each thread using performance counters to construct a deterministic logical time that is used to compute an interleaving of shared data accesses that is both deterministic and provides good load balancing. Kendo can run on today's commodity hardware while incurring only a modest performance cost. Experimental results on the SPLASH-2 applications yield a geometric mean overhead of only 16% when running on 4 processors. This low overhead makes it possible to benefit from Kendo even after an application is deployed. Programmers can start using Kendo today to program parallel applications that are easier to develop, debug, and test.

References

[1]
Claudio Basile, Keith Whisnant, Zbigniew Kalbarczyk, and Ravi Iyer. Loose synchronization of multithreaded replicas. pages 250--255, 2002.
[2]
R. Bocchino, V. Adve, S. Adve, and M. Snir. Parallel programming must be deterministic by default. Technical Report UIUCDCS-R-2008-3012, University of Illinois at Urbana-Champaign, 2008. URL http://dpj.cs.uiuc.edu/DPJ/Publications_files/paper.pdf.
[3]
Guang-Ien Cheng, Mingdong Feng, Charles E. Leiserson, Keith H. Randall, and Andrew F. Stark. Detecting data races in cilk programs that use locks. In proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA'98), pages 298--309, Puerto Vallarta, Mexico, June 28-July 2, 1998.
[4]
Joe Devietti, Brandon Lucia, Luis Ceze, and Mark Oskin. Explicitly parallel programming with shared-memory is insane: At least make it deterministic! In proceedings of SHCMP 2008: Workshop on Software and Hardware Challenges of Manycore Platforms, Beijing, China, June 22, 2008.
[5]
Rachna Dhamija and Adrian Perrig. Deja vu: a user study using images for authentication. In SSYM'00: Proceedings of the 9th conference on USENIX Security Symposium, pages 4--4, Berkeley, CA, USA, 2000. USENIX Association.
[6]
Jorg Domaschka, Franz J. Hauck, Hans P. Reiser, and Rutiger Kapitza. Deterministic multithreading for java-based replicated objects. In proceedings of the International Conference on Parallel and Distributed Computing and Systems, 2006.
[7]
Jorg Domaschka, Andreas I. Schmied, Hans P. Reiser, and Franz J. Hauck. Revisiting deterministic multithreading strategies. International Parallel and Distributed Processing Symposium, pages 1--8, 2007.
[8]
George W. Dunlap, Dominic G. Lucchetti, Michael A. Fetterman, and Peter M. Chen. Execution replay of multiprocessor virtual machines. In VEE'08: Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, pages 121--130, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-796-4.
[9]
Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. The implementation of the Cilk-5 multithreaded language. In proceedings of the ACM SIGPLAN-98 Conference on Programming Language Design and Implementation, pages 212--223, Montreal, Quebec, Canada, June 1998. Proceedings published ACMSIGPLAN Notices, Vol. 33, No. 5, May, 1998.
[10]
Derek R. Hower and Mark D. Hill. Rerun: Exploiting episodes for lightweight memory race recording. In proceedings of the 35th Annual International Symposium on Computer Architecture (ISCA), June 2008.
[11]
Wolfgang Karl, Markus Leberecht, and Michael Oberhuber. Forcing deterministic execution of parallel programs -- debugging support through the smile monitoring approach. In proceedings of the SCI-Europe, September 1998.
[12]
Milind Kulkarni, Keshav Pingali, Ganesh Ramanarayanan, Bruce Walter, Kavita Bala, and L. Paul Chew. Optimistic parallelism benefits from data partitioning. SIGARCH Comput. Archit. News, 36(1):233--243, 2008. ISSN 0163-5964.
[13]
Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7):558--565, 1978. ISSN 0001-0782.
[14]
T. J. LeBlanc and J. M. Mellor-Crummey. Debugging parallel programs with instant replay. IEEE Trans. Comput., 36(4):471--482, 1987. ISSN 0018-9340.
[15]
Edward A. Lee. The problem with threads. Computer, 39(5):33--42, May 2006. ISSN 0018-9162.
[16]
Shan Lu, Joe Tucek, Feng Qin, and Yuanyuan Zhou. Avio: Detecting atomicity violations via access-interleaving invariants. In proceedings of the International Conference on Architecture Support for Programming Languages and Operating Systems, October 2006.
[17]
Shan Lu, Weihang Jiang, and Yuanyuan Zhou. A study of interleaving coverage criteria. In proceedings of ESEC-FSE, pages 533--536, New York, NY, USA, 2007a. ACM. ISBN 978-1-59593-811-4.
[18]
Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, and Yuanyuan Zhou. Muvi: Automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs. In proceedings of the 21st ACM Symposium on Operating Systems Principles, October 2007b.
[19]
Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. Learning from mistakes: a comprehensive study on real world concurrency bug characteristics. In ASPLOS XIII: proceedings of the 13th international conference on Architectural support for programming languages and operating systems, pages 329--339, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-958-6.
[20]
Pablo Montesinos, Luis Ceze, and Josep Torrellas. Delorean: Recording and deterministically replaying shared-memory multiprocessor execution efficiently. In proceedings of the 35th Annual International Symposium on Computer Architecture (ISCA), June 2008.
[21]
Hans P. Reiser, Jorg Domaschka, Franz J. Hauck, Rudiger Kapitza, and Wolfgang Schroder-Preikschat. Consistent replication of multithreaded distributed objects. 25th IEEE Symposium on Reliable Distributed Systems, pages 257--266, 2006. ISSN 1060-9857.
[22]
Jonathan Rose. Locusroute: a parallel global router for standard cells. In DAC-88: Proceedings of the 25th ACM/IEEE conference on Design automation, pages 189--195, Los Alamitos, CA, USA, 1988. IEEE Computer Society Press. ISBN 0-8186-8864-5.
[23]
Mark Russinovich and Bryce Cogswell. Replay for concurrent non-deterministic shared-memory applications. In proceedings of the ACM SIGPLAN 1996 Conference on Programming Language Design and Implementation, pages 258--266, New York, NY, USA, 1996. ACM. ISBN 0-89791-795-2.
[24]
Debashis Saha and Sourav K. Dutta. Specification of deterministic execution timing schema for parallel programs on a multiprocessor. Computer, Communication, Control and Power Engineering, pages 114--116 vol.1, 1993.
[25]
Jaswinder Pal Singh, Anoop Gupta, and Marc Levoy. Parallel visualization algorithms: Performance and architectural implications. Computer, 27(7):45--55, 1994. ISSN 0018-9162.
[26]
William Thies, Michal Karczmarek, and Saman Amarasinghe. Streamit: A language for streaming applications. In International Conference on Compiler Construction, Grenoble, France, April 2002.
[27]
Joseph Tucek, Shan Lu, Chengdu Huang, Spiros Xanthos, and Yuanyuan Zhou. Triage: diagnosing production run failures at the user's site. In proceedings of Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, pages 131--144, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-591-5.
[28]
Larry D. Wittie. Debugging distributed C programs by real time reply. SIGPLAN Not., 24(1):57--67, 1989. ISSN 0362-1340.
[29]
S.C. Woo, M. Ohara, E. Torrie, J.P. Singh, and A. Gupta. The SPLASH-2 programs: characterization and methodological considerations. In proceedings of 22nd Annual International Symposium on Computer Architecture News, pages 24--36, June 1995.
[30]
Min Xu, Rastislav Bodik, and Mark D. Hill. A "flight data recorder" for enabling full-system multiprocessor deterministic replay. SIGARCH Comput. Archit. News, 31(2):122--135, 2003. ISSN 0163-5964.

Cited By

View all
  • (2024)Ensuring State Continuity for Confidential Computing: A Blockchain-Based ApproachIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2024.338197321:6(5635-5649)Online publication date: 1-Nov-2024
  • (2022)NARRATORProceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security10.1145/3548606.3560620(2385-2399)Online publication date: 7-Nov-2022
  • (2022)The usage of cybernetic in complex software systems and its application to the deterministic multithreadingConcurrency and Computation: Practice and Experience10.1002/cpe.737534:28Online publication date: 31-Oct-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGARCH Computer Architecture News
ACM SIGARCH Computer Architecture News  Volume 37, Issue 1
ASPLOS 2009
March 2009
346 pages
ISSN:0163-5964
DOI:10.1145/2528521
Issue’s Table of Contents
  • cover image ACM Conferences
    ASPLOS XIV: Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
    March 2009
    358 pages
    ISBN:9781605584065
    DOI:10.1145/1508244
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 March 2009
Published in SIGARCH Volume 37, Issue 1

Check for updates

Author Tags

  1. debugging
  2. determinism
  3. deterministic multithreading
  4. multicore
  5. parallel programming

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)45
  • Downloads (Last 6 weeks)3
Reflects downloads up to 01 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Ensuring State Continuity for Confidential Computing: A Blockchain-Based ApproachIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2024.338197321:6(5635-5649)Online publication date: 1-Nov-2024
  • (2022)NARRATORProceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security10.1145/3548606.3560620(2385-2399)Online publication date: 7-Nov-2022
  • (2022)The usage of cybernetic in complex software systems and its application to the deterministic multithreadingConcurrency and Computation: Practice and Experience10.1002/cpe.737534:28Online publication date: 31-Oct-2022
  • (2021)Postmortem accurate IR-level state recovery for deployed concurrent programsACM SIGAPP Applied Computing Review10.1145/3493499.349350221:3(33-48)Online publication date: 20-Oct-2021
  • (2020)TimelyRep: Timing deterministic replay for Android web applicationsSoftware Testing, Verification and Reliability10.1002/stvr.174530:4-5Online publication date: 6-Jul-2020
  • (2019)Extracting Safe Thread Schedules from Incomplete Model Checking ResultsModel Checking Software10.1007/978-3-030-30923-7_9(153-171)Online publication date: 15-Jul-2019
  • (2018)REPTProceedings of the 13th USENIX conference on Operating Systems Design and Implementation10.5555/3291168.3291171(17-32)Online publication date: 8-Oct-2018
  • (2018)Early Scheduling in Parallel State Machine ReplicationProceedings of the ACM Symposium on Cloud Computing10.1145/3267809.3267825(82-94)Online publication date: 11-Oct-2018
  • (2017)Lazy Diagnosis of In-Production Concurrency BugsProceedings of the 26th Symposium on Operating Systems Principles10.1145/3132747.3132767(582-598)Online publication date: 14-Oct-2017
  • (2016)Using DEv-PROMELA for Modelling and Verification of SoftwareProceedings of the 2016 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation10.1145/2901378.2901388(245-253)Online publication date: 15-May-2016
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media