DOI: 10.1145/3236367.3236369
research-article

MC-CChecker: A Clock-Based Approach to Detect Memory Consistency Errors in MPI One-Sided Applications

Published: 23 September 2018

Abstract

MPI one-sided communication decouples data movement from synchronization, which eliminates the overhead of unneeded synchronization and allows for greater concurrency. This decoupling is the great advantage of MPI one-sided communication, but it also makes it very hard for programmers to keep their programs reliable. Memory consistency errors are notorious for degrading both the reliability and the performance of MPI one-sided applications, and even an MPI expert can easily make them. The lockopts bug, which occurred in an RMA test case that is part of the MPICH MPI implementation, is one example. Hence, detecting memory consistency errors is extremely challenging. MC-Checker is the state-of-the-art debugger for these errors. It detects memory consistency errors based on the happened-before relation. However, exploiting the full relation prevents MC-Checker's DN-Analyzer from scaling well, so MC-Checker ignores the transitive ordering of the happened-before relation to keep DN-Analyzer scalable. Consequently, MC-Checker is prone to a potential source of false positives.
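To illustrate why dropping transitivity can lead to false positives, the following minimal sketch compares an ordering check that looks only at directly recorded happened-before edges with one that follows chains of edges. The three events and the `direct` edge set are hypothetical and are not taken from MC-Checker's implementation.

```python
from itertools import combinations

# Hypothetical trace: three events with two directly recorded
# happened-before edges, a -> b and b -> c.
events = ["a", "b", "c"]
direct = {("a", "b"), ("b", "c")}


def ordered_direct(x, y):
    """Ordering check that ignores transitivity: only recorded edges count."""
    return (x, y) in direct or (y, x) in direct


def reachable(src, dst):
    """Depth-first search over direct edges (the transitive closure)."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for u, v in direct:
            if u == node and v not in seen:
                if v == dst:
                    return True
                seen.add(v)
                stack.append(v)
    return False


def ordered_transitive(x, y):
    """Ordering check that respects the full happened-before relation."""
    return reachable(x, y) or reachable(y, x)


def unordered_pairs(is_ordered):
    """Pairs a checker would treat as concurrent, hence as conflict candidates."""
    return [(x, y) for x, y in combinations(events, 2) if not is_ordered(x, y)]


flagged_without_transitivity = unordered_pairs(ordered_direct)
flagged_with_transitivity = unordered_pairs(ordered_transitive)
```

In this sketch the pair (a, c) is reported as unordered only by the check that ignores transitivity, although a precedes c through b. If both events touched the same memory location, such a pair would be a false positive.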
To overcome this issue, we present MC-CChecker, a novel clock-based approach that fully preserves the happened-before relation by means of an encoded vector clock. MC-CChecker inherits the distinguishing features of MC-Checker by reusing its ST-Analyzer and Profiler, and focuses mainly on optimizing DN-Analyzer. The experimental results show that MC-CChecker not only detects memory consistency errors as effectively as MC-Checker, but also completely eliminates the potential source of false positives that is a major limitation of MC-Checker, while keeping the execution-time and memory overheads of DN-Analyzer acceptable. In particular, the DN-Analyzer of MC-CChecker scales fairly well when processing the large number of trace files generated by running lockopts on up to 8192 processes.
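As a rough illustration of the encoded-vector-clock idea, the sketch below represents each process's vector clock as a single integer built from powers of a distinct prime, so that causal ordering reduces to a divisibility test. The class name, method names, and the three-process scenario are assumptions made for this example; it is not the DN-Analyzer code of MC-CChecker.

```python
from math import gcd


class EncodedVectorClock:
    """One process's clock: a single integer encoding a vector clock."""

    def __init__(self, prime):
        self.prime = prime  # distinct prime assigned to this process
        self.value = 1      # encodes the all-zero vector

    def tick(self):
        """Local event: multiply the clock by the process's own prime."""
        self.value *= self.prime
        return self.value

    def merge(self, received):
        """Receive event: take the LCM with the sender's timestamp, then tick."""
        self.value = self.value * received // gcd(self.value, received)
        return self.tick()


def happened_before(a, b):
    """Timestamp a precedes b iff a properly divides b."""
    return a != b and b % a == 0


# Hypothetical three-process scenario using primes 2, 3, and 5.
p0, p1, p2 = EncodedVectorClock(2), EncodedVectorClock(3), EncodedVectorClock(5)
a = p0.tick()    # event on process 0
b = p1.merge(a)  # process 1 receives a's timestamp
c = p2.merge(b)  # process 2 receives b's timestamp
d = p0.tick()    # later local event on process 0, unknown to the others
```

Because divisibility is transitive, `happened_before(a, c)` holds without any explicit chain lookup, while `c` and `d` divide neither way and are therefore concurrent. Python integers have arbitrary precision, which sidesteps overflow here; the encoded values still grow with every event.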


      Published In

      EuroMPI '18: Proceedings of the 25th European MPI Users' Group Meeting
      September 2018
      187 pages
      ISBN: 9781450364928
      DOI: 10.1145/3236367
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Encoded Vector Clock
      2. MC-Checker
      3. MPI
      4. Memory Consistency Error
      5. One-Sided Communication

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      EuroMPI'18
      EuroMPI'18: 25th European MPI Users' Group Meeting
      September 23 - 26, 2018
      Barcelona, Spain

      Acceptance Rates

      Overall Acceptance Rate 66 of 139 submissions, 47%

      Article Metrics

      • Downloads (Last 12 months): 15
      • Downloads (Last 6 weeks): 1
      Reflects downloads up to 11 Jan 2025

      Cited By

      • (2024) RMASanitizer: Generalized Runtime Detection of Data Races in Remote Memory Access Applications. Proceedings of the 53rd International Conference on Parallel Processing, 833-844. DOI: 10.1145/3673038.3673109. Online publication date: 12-Aug-2024
      • (2024) MPI Errors Detection using GNN Embedding and Vector Embedding over LLVM IR. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 595-607. DOI: 10.1109/IPDPS57955.2024.00059. Online publication date: 27-May-2024
      • (2024) Static-Dynamic Analysis for Performance and Accuracy of Data Race Detection in MPI One-Sided Programs. High Performance Computing. ISC High Performance 2024 International Workshops, 59-73. DOI: 10.1007/978-3-031-73716-9_5. Online publication date: 14-Dec-2024
      • (2023) RMARaceBench: A Microbenchmark Suite to Evaluate Race Detection Tools for RMA Programs. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 205-214. DOI: 10.1145/3624062.3624087. Online publication date: 12-Nov-2023
      • (2023) Rethinking Data Race Detection in MPI-RMA Programs. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 196-204. DOI: 10.1145/3624062.3624086. Online publication date: 12-Nov-2023
      • (2023) Leveraging Static Analysis to Accelerate Dynamic Race Detection for Remote Memory Access Programs. High Performance Computing. ISC High Performance 2024 International Workshops, 45-58. DOI: 10.1007/978-3-031-73716-9_4. Online publication date: 12-May-2023
      • (2022) On-the-Fly Data Race Detection for MPI RMA Programs with MUST. 2022 IEEE/ACM Sixth International Workshop on Software Correctness for HPC Applications (Correctness), 27-36. DOI: 10.1109/Correctness56720.2022.00009. Online publication date: Nov-2022
      • (2022) Static Local Concurrency Errors Detection in MPI-RMA Programs. 2022 IEEE/ACM Sixth International Workshop on Software Correctness for HPC Applications (Correctness), 18-26. DOI: 10.1109/Correctness56720.2022.00008. Online publication date: Nov-2022
      • (2020) Resettable Encoded Vector Clock for Causality Analysis With an Application to Dynamic Race Detection. IEEE Transactions on Parallel and Distributed Systems 32:4, 772-785. DOI: 10.1109/TPDS.2020.3032293. Online publication date: 18-Nov-2020
      • (2018) On the Growth of the Prime Numbers Based Encoded Vector Clock. Distributed Computing and Internet Technology, 169-184. DOI: 10.1007/978-3-030-05366-6_14. Online publication date: 11-Dec-2018
