[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3018743.3018768acmconferencesArticle/Chapter ViewAbstractPublication PagesppoppConference Proceedingsconference-collections
research-article

An Efficient Abortable-locking Protocol for Multi-level NUMA Systems

Published: 26 January 2017 Publication History

Abstract

The popularity of Non-Uniform Memory Access (NUMA) architectures has led to numerous locality-preserving hierarchical lock designs, such as HCLH, HMCS, and cohort locks. Locality-preserving locks trade fairness for higher throughput. Hence, some instances of acquisitions can incur long latencies, which may be intolerable for certain applications. Few locks admit a waiting thread to abandon its protocol on a timeout. State-of-the-art abortable locks are not fully locality aware, introduce high overheads, and unsuitable for frequent aborts. Enhancing locality-aware locks with lightweight timeout capability is critical for their adoption. In this paper, we design and evaluate the HMCS-T lock, a Hierarchical MCS (HMCS) lock variant that admits a timeout. HMCS-T maintains the locality benefits of HMCS while ensuring aborts to be lightweight. HMCS-T offers the progress guarantee missing in most abortable queuing locks. Our evaluations show that HMCS-T offers the timeout feature at a moderate overhead over its HMCS analog. HMCS-T, used in an MPI runtime lock, mitigated the poor scalability of an MPI+OpenMP BFS code and resulted in 4.3x superior scaling.

References

[1]
A. Amer, P. Balaji, W. Bland, W. Gropp, R. Latham, H. Lu, L. Oden, A. Pena, K. Raffenetti, S. Seo, T. Rajeev, and Z. Junchao. MPICH User's Guide, Version 3.2. http://www.mpich.org/static/downloads/3.2/mpich-3.2-userguide.pdf, 2015.
[2]
A. Amer, H. Lu, P. Balaji, and S. Matsuoka. Characterizing MPI and Hybrid MPI
[3]
Threads Applications at Scale: Case Study with BFS. In Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on, pages 1075--1083. IEEE, 2015.
[4]
A. Amer, H. Lu, Y. Wei, P. Balaji, and S. Matsuoka. MPI
[5]
Threads: Runtime Contention and Remedies. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, pages 239--248, New York, NY, USA, 2015. ACM.
[6]
A. Amer, H. Lu, Y. Wei, H. Jeff, S. Matsuoka, and P. Balaji. Locking Aspects in Multithreaded MPI Implementations. Technical Report ANL/MCS-P6005-0516, 2016.
[7]
M. Chabbi, M. Fagan, and J. Mellor-Crummey. High Performance Locks for Multi-level NUMA Systems. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 215--226, 2015.
[8]
M. Chabbi and J. Mellor-Crummey. Contention-conscious, Locality-preserving Locks. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '16, pages 22:1--22:14, New York, NY, USA, 2016. ACM.
[9]
T. David, R. Guerraoui, and V. Trigonakis. Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP '13, pages 33--48, New York, NY, USA, 2013. ACM.
[10]
D. Dice, V. J. Marathe, and N. Shavit. Lock Cohorting: A General Technique for Designing NUMA Locks. In Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 247--256, 2012.
[11]
M. Frigo, C. E. Leiserson, and K. H. Randall. The Implementation of the Cilk-5 Multithreaded Language. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation, PLDI '98, pages 212--223, New York, NY, USA, 1998. ACM.
[12]
Hewlett Packard Enterprise. HP Integrity Superdome X. http://www8.hp.com/h20195/v2/GetPDF.aspx/c04383189.pdf.
[13]
T. Hoefler, C. Siebert, and A. Lumsdaine. Scalable Communication Protocols for Dynamic Sparse Data Exchange. In Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '10, pages 159--168, New York, NY, USA, 2010. ACM.
[14]
G. J. Holzmann. The Model Checker SPIN. IEEE Transactions on Software Engineering -- Special issue on formal methods in software practice, 23(5):279--295, May 1997.
[15]
Intel Corp. An Introduction to the Intel® QuickPath Interconnect. http://www.intel.com/content/www/us/en/io/quickpath-technology/quick-path-interconnect-introduction-paper.html, 2009.
[16]
P. Jayanti. Adaptive and Efficient Abortable Mutual Exclusion. In Proceedings of the Twenty-second Annual Symposium on Principles of Distributed Computing, PODC '03, pages 295--304, New York, NY, USA, 2003. ACM.
[17]
P. S. Magnusson, A. Landin, and E. Hagersten. Queue Locks on Cache Coherent Multiprocessors. In Proceedings of the 8th International Symposium on Parallel Processing, pages 165--171, 1994.
[18]
V. Marathe, M. Moir, and N. Shavit. Composite Abortable Locks. In Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International, pages 1--10, April 2006.
[19]
J. M. Mellor-Crummey and M. L. Scott. Algorithms for Scalable Synchronization on Shared-memory Multiprocessors. ACM Transactions on Computer Systems, 9(1):21--65, February 1991.
[20]
A. Pareek and P. Woelfel. RMR-efficient Randomized Abortable Mutual Exclusion. In Proceedings of the 26th International Conference on Distributed Computing, DISC'12, pages 267--281, Berlin, Heidelberg, 2012. Springer-Verlag.
[21]
M. L. Scott. Non-blocking Timeout in Scalable Queue-based Spin Locks. In Proceedings of the Twenty-first Annual Symposium on Principles of Distributed Computing, PODC '02, pages 31--40, New York, NY, USA, 2002. ACM.
[22]
M. L. Scott and W. N. Scherer. Scalable Queue-based Spin Locks with Timeout. In Proceedings of the Eighth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, PPoPP '01, pages 44--52, New York, NY, USA, 2001. ACM.
[23]
Scott, Michael. Non-Blocking Timeout in Scalable Queue-Based Spin Locks. https://www.cs.rochester.edu/research/synchronization/pseudocode/nb_timeout.html.
[24]
SGI. SGI UV The World's Most Powerful In-Memory Supercomputers. https://www.sgi.com/products/servers/uv/.
[25]
D. D. Sleator and R. E. Tarjan. Self-adjusting Binary Search Trees. Journal of the ACM, 32(3):652--686, July 1985.
[26]
L. G. Valiant. A Bridging Model for Parallel Computation. Communications of the ACM, 33(8):103--111, Aug. 1990.

Cited By

View all
  • (2023)Protecting Locks Against Unbalanced Unlock()Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591091(199-211)Online publication date: 17-Jun-2023
  • (2020)Efficient Abortable-locking Protocol for Multi-level NUMA SystemsACM Transactions on Parallel Computing10.1145/33997287:3(1-32)Online publication date: 10-Jul-2020
  • (2019)Scalable and practical locking with shufflingProceedings of the 27th ACM Symposium on Operating Systems Principles10.1145/3341301.3359629(586-599)Online publication date: 27-Oct-2019
  • Show More Cited By

Index Terms

  1. An Efficient Abortable-locking Protocol for Multi-level NUMA Systems

                        Recommendations

                        Comments

                        Please enable JavaScript to view thecomments powered by Disqus.

                        Information & Contributors

                        Information

                        Published In

                        cover image ACM Conferences
                        PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
                        January 2017
                        476 pages
                        ISBN:9781450344937
                        DOI:10.1145/3018743
                        Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

                        Sponsors

                        Publisher

                        Association for Computing Machinery

                        New York, NY, United States

                        Publication History

                        Published: 26 January 2017

                        Permissions

                        Request permissions for this article.

                        Check for updates

                        Author Tags

                        1. abortable lock
                        2. hierarchical lock
                        3. mcs lock
                        4. numa
                        5. queuing lock
                        6. spin lock
                        7. synchronization
                        8. timeout lock

                        Qualifiers

                        • Research-article

                        Funding Sources

                        • .S. Department of Energy, Office of Science
                        • Center for Selective C-H Functionalization, National Science Foundation

                        Conference

                        PPoPP '17
                        Sponsor:

                        Contributors

                        Other Metrics

                        Bibliometrics & Citations

                        Bibliometrics

                        Article Metrics

                        • Downloads (Last 12 months)12
                        • Downloads (Last 6 weeks)1
                        Reflects downloads up to 16 Jan 2025

                        Other Metrics

                        Citations

                        Cited By

                        View all
                        • (2023)Protecting Locks Against Unbalanced Unlock()Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591091(199-211)Online publication date: 17-Jun-2023
                        • (2020)Efficient Abortable-locking Protocol for Multi-level NUMA SystemsACM Transactions on Parallel Computing10.1145/33997287:3(1-32)Online publication date: 10-Jul-2020
                        • (2019)Scalable and practical locking with shufflingProceedings of the 27th ACM Symposium on Operating Systems Principles10.1145/3341301.3359629(586-599)Online publication date: 27-Oct-2019
                        • (2019)Lock Contention Management in Multithreaded MPIACM Transactions on Parallel Computing10.1145/32754435:3(1-21)Online publication date: 8-Jan-2019
                        • (2018)Deterministic Abortable Mutual Exclusion with Sublogarithmic Adaptive RMR ComplexityProceedings of the 2018 ACM Symposium on Principles of Distributed Computing10.1145/3212734.3212759(27-36)Online publication date: 23-Jul-2018
                        • (2018)Featherlight on-the-fly false-sharing detectionACM SIGPLAN Notices10.1145/3200691.317849953:1(152-167)Online publication date: 10-Feb-2018
                        • (2018)Featherlight on-the-fly false-sharing detectionProceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3178487.3178499(152-167)Online publication date: 10-Feb-2018
                        • (2021)FTSDProceedings of the 12th ACM SIGOPS Asia-Pacific Workshop on Systems10.1145/3476886.3477518(123-130)Online publication date: 24-Aug-2021
                        • (2021)Verifying and Optimizing the HMCS Lock for Arm ServersNetworked Systems10.1007/978-3-030-91014-3_17(240-260)Online publication date: 19-May-2021
                        • (2020)An HTM-based update-side synchronization for RCU on NUMA systemsProceedings of the Fifteenth European Conference on Computer Systems10.1145/3342195.3387527(1-15)Online publication date: 15-Apr-2020

                        View Options

                        Login options

                        View options

                        PDF

                        View or Download as a PDF file.

                        PDF

                        eReader

                        View online with eReader.

                        eReader

                        Media

                        Figures

                        Other

                        Tables

                        Share

                        Share

                        Share this Publication link

                        Share on social media