[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Simplifying ARM concurrency: multicopy-atomic axiomatic and operational models for ARMv8

Published: 27 December 2017 Publication History

Abstract

ARM has a relaxed memory model, previously specified in informal prose for ARMv7 and ARMv8. Over time, and partly due to work building formal semantics for ARM concurrency, it has become clear that some of the complexity of the model is not justified by the potential benefits. In particular, the model was originally non-multicopy-atomic: writes could become visible to some other threads before becoming visible to all — but this has not been exploited in production implementations, the corresponding potential hardware optimisations are thought to have insufficient benefits in the ARM context, and it gives rise to subtle complications when combined with other ARMv8 features. The ARMv8 architecture has therefore been revised: it now has a multicopy-atomic model. It has also been simplified in other respects, including more straightforward notions of dependency, and the architecture now includes a formal concurrency model.
In this paper we detail these changes and discuss their motivation. We define two formal concurrency models: an operational one, simplifying the Flowing model of Flur et al., and the axiomatic model of the revised ARMv8 specification. The models were developed by an academic group and by ARM staff, respectively, and this extended collaboration partly motivated the above changes. We prove the equivalence of the two models. The operational model is integrated into an executable exploration tool with new web interface, demonstrated by exhaustively checking the possible behaviours of a loop-unrolled version of a Linux kernel lock implementation, a previously known bug due to unprevented speculation, and a fixed version.

Supplementary Material

WEBM File (simplifyingarm.webm)

References

[1]
Allon Adir, Hagit Attiya, and Gil Shurek. 2003. Information-Flow Models for Shared Memory with an Application to the PowerPC Architecture. IEEE Trans. Parallel Distrib. Syst. 14, 5 (2003), 502–515.
[2]
Mustaque Ahamad, Gil Neiger, James E. Burns, Prince Kohli, and Phillip W. Hutto. 1995. Causal memory: definitions, implementation, and programming. Distributed Computing 9, 1 (1995), 37–49.
[3]
Jade Alglave, Anthony Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit Sarkar, Peter Sewell, and Francesco Zappa Nardelli. 2009. The Semantics of Power and ARM Multiprocessor Machine Code. In Proc. DAMP 2009.
[4]
Jade Alglave and Luc Maranget. 2017. http://diy.inria.fr/doc/index.html . (April 2017).
[5]
Jade Alglave, Luc Maranget, Susmit Sarkar, and Peter Sewell. 2010. Fences in Weak Memory Models. In Proc. CAV.
[6]
Jade Alglave, Luc Maranget, Susmit Sarkar, and Peter Sewell. 2011. Litmus: running tests against hardware. In Proceedings of TACAS 2011: the 17th international conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer-Verlag, Berlin, Heidelberg, 41–44. http://dl.acm.org/citation.cfm?id=1987389.1987395
[7]
Jade Alglave, Luc Maranget, and Michael Tautschnig. 2014. Herding Cats: Modelling, Simulation, Testing, and Data Mining for Weak Memory. ACM TOPLAS 36, 2, Article 7 (July 2014), 74 pages.
[8]
ARM Ltd. 2016. ARM Architecture Reference Manual (ARMv8, for ARMv8-A architecture profile). ARM Ltd. ARM DDI 0487A.k_iss10775 (ID092916).
[9]
ARM Ltd. 2017. ARM Architecture Reference Manual (ARMv8, for ARMv8-A architecture profile). ARM Ltd. ARM DDI 0487B.a (ID033117).
[10]
ARM Ltd. 2017. ARM Processor Cortex-A53 MPCore Product Revision r0 Software Developers Errata Notice. http://infocenter.arm.com/help/topic/com.arm.doc.epm048406/Cortex_A53_MPCore_Software_Developers_Errata_ Notice.pdf . (May 2017).
[11]
Arvind Arvind and Jan-Willem Maessen. 2006. Memory Model = Instruction Reordering + Store Atomicity. SIGARCH Comput. Archit. News 34, 2 (May 2006), 29–40.
[12]
Mark Batty, Kayvan Memarian, Kyndylan Nienhuis, Jean Pichon-Pharabod, and Peter Sewell. 2015. The Problem of Programming Language Concurrency Semantics. In Programming Languages and Systems - 24th European Symposium on Programming, ESOP 2015, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2015, London, UK, April 11-18, 2015. Proceedings. 283–307.
[13]
Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. 2011. Mathematizing C++ Concurrency. In Proc. POPL.
[14]
Pete Becker (Ed.). 2011. Programming Languages — C++. ISO/IEC 14882:2011. http://www.open- std.org/jtc1/sc22/wg21/ docs/papers/2011/n3242.pdf .
[15]
James Bornholt and Emina Torlak. 2017. Synthesizing Memory Models from Framework Sketches and Litmus Tests. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA, 467–481.
[16]
Sebastian Burckhardt and Madanlal Musuvathi. 2008. Effective Program Verification for Relaxed Memory Models. Springer Berlin Heidelberg, Berlin, Heidelberg, 107–120.
[17]
Nathan Chong and Samin Ishtiaq. 2008. Reasoning about the ARM weakly consistent memory model. In MSPC.
[18]
William W. Collier. 1992. Reasoning about parallel architectures. Prentice Hall, Englewood Cliffs. http://opac.inria.fr/record= b1105256
[19]
F. Corella, J. M. Stone, and C. M. Barton. 1993. A formal specification of the PowerPC shared memory architecture. Technical Report RC18638. IBM.
[20]
Mike Daines. 2017. Viz.js, a hack to put Graphviz on the web. https://github.com/mdaines/viz.js/ . (2017).
[21]
Will Deacon. 2015. Linux commit ‘arm64: spinlock: serialise spin_unlock_wait against concurrent lockers’. https://git.kernel. org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d86b8da04dfa . (2015).
[22]
Will Deacon. 2016. The ARMv8 Application Level Memory Model. https://github.com/herd/herdtools7/blob/master/herd/ libdir/aarch64.cat . (2016).
[23]
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell. 2016a. The Flowing and POP Models (supplementary material for Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA). (2016). http://www.cl.cam.ac.uk/~sf502/popl16/model_full.pdf .
[24]
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell. 2016b. Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA. In Proceedings of POPL: the 43rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages.
[25]
Shaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan Nienhuis, Luc Maranget, Kathryn E. Gray, Ali Sezgin, Mark Batty, and Peter Sewell. 2017. Mixed-size Concurrency: ARM, POWER, C/C++11, and SC. In POPL 2017: The 44th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Paris, France.
[26]
Emden R. Gansner and Stephen C. North. 2000. An open graph visualization system and its applications to software engineering. Software: Practice and Experience 30, 11 (2000), 1203–1233. http://www.graphviz.org/ .
[27]
Kathryn E. Gray, Gabriel Kerneis, Dominic Mulligan, Christopher Pulte, Susmit Sarkar, and Peter Sewell. 2015. An integrated concurrency and core-ISA architectural envelope definition, and test oracle, for IBM POWER multiprocessors. In Proc. MICRO-48, the 48th Annual IEEE/ACM International Symposium on Microarchitecture.
[28]
Lisa Higham, Jalal Kawash, and Nathaly Verwaal. 1998. Weak Memory Consistency Models. Part I: Definitions and Comparisons. Technical Report. Department of Computer Science, The University of Calgary.
[29]
David Howells, Paul E. McKenney, Will Deacon, and Peter Zijlstra. 2016. Documentation/memory-barriers.txt. https: //www.kernel.org/doc/Documentation/memory- barriers.txt . (2016).
[30]
Jeehoon Kang, Chung-Kil Hur, Ori Lahav, Viktor Vafeiadis, and Derek Dreyer. 2017. A Promising Semantics for Relaxedmemory Concurrency. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). ACM, New York, NY, USA, 175–189.
[31]
Linux contributors. 2014. Documentation/locking/spinlocks.txt. https://www.kernel.org/doc/Documentation/locking/ spinlocks.txt . (2014).
[32]
Daniel Lustig, Andrew Wright, Alexandros Papakonstantinou, and Olivier Giroux. 2017. Automated Synthesis of Comprehensive Memory Model Litmus Test Suites. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’17). ACM, New York, NY, USA, 661–675.
[33]
Sela Mador-Haim, Rajeev Alur, and Milo M. K. Martin. 2010. Generating Litmus Tests for Contrasting Memory Consistency Models. In Computer Aided Verification, 22nd International Conference, CAV 2010, Edinburgh, UK, July 15-19, 2010. Proceedings. 273–287.
[34]
Sela Mador-Haim, Luc Maranget, Susmit Sarkar, Kayvan Memarian, Jade Alglave, Scott Owens, Rajeev Alur, Milo M. K. Martin, Peter Sewell, and Derek Williams. 2012. An Axiomatic Memory Model for POWER Multiprocessors. In Proceedings of CAV 2012: the 24th International Conference on Computer Aided Verification. 495–512.
[35]
Yatin A. Manerkar, Daniel Lustig, Michael Pellauer, and Margaret Martonosi. 2015. CCICheck: Using µHb Graphs to Verify the Coherence-consistency Interface. In Proceedings of the 48th International Symposium on Microarchitecture (MICRO-48). ACM, New York, NY, USA, 26–37.
[36]
Luc Maranget, Susmit Sarkar, and Peter Sewell. 2012. A Tutorial Introduction to the ARM and POWER Relaxed Memory Models. Draft available from http://www.cl.cam.ac.uk/~pes20/ppc- supplemental/test7.pdf . (2012).
[37]
Paul E McKenney. 2017. Remove spin_unlock_wait(). (Jun 2017). https://lkml.org/lkml/2017/6/29/967
[38]
Dominic P. Mulligan, Scott Owens, Kathryn E. Gray, Tom Ridge, and Peter Sewell. 2014. Lem: reusable engineering of realworld semantics. In Proceedings of ICFP 2014: the 19th ACM SIGPLAN International Conference on Functional Programming. 175–188.
[39]
Scott Owens, Susmit Sarkar, and Peter Sewell. 2009. A better x86 memory model: x86-TSO. In Proceedings of TPHOLs 2009: Theorem Proving in Higher Order Logics, LNCS 5674. 391–407.
[40]
Jean Pichon-Pharabod and Peter Sewell. 2016. A concurrency semantics for relaxed atomics that permits optimisation and avoids thin-air executions. In Proceedings of POPL.
[41]
Susmit Sarkar, Kayvan Memarian, Scott Owens, Mark Batty, Peter Sewell, Luc Maranget, Jade Alglave, and Derek Williams. 2012. Synchronising C/C++ and POWER. In Proceedings of PLDI 2012, the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation (Beijing). 311–322.
[42]
Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget, and Derek Williams. 2011. Understanding POWER Multiprocessors. In Proceedings of PLDI 2011: the 32nd ACM SIGPLAN conference on Programming Language Design and Implementation. 175–186.
[43]
Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge, Thomas Braibant, Magnus Myreen, and Jade Alglave. 2009. The Semantics of x86-CC Multiprocessor Machine Code. In Proceedings of POPL 2009: the 36th annual ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages. 379–391.
[44]
Jérôme Vouillon and Vincent Balat. 2014. From bytecode to JavaScript: the Js_of_ocaml compiler. Software: Practice and Experience 44, 8 (2014), 951–972.
[45]
John Wickerson, Mark Batty, Tyler Sorensen, and George A. Constantinides. 2017. Automatically Comparing Memory Consistency Models. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). ACM, New York, NY, USA, 190–204.
[46]
Alon Zakai. 2011. Emscripten: An LLVM-to-JavaScript Compiler. In Proceedings of the ACM International Conference Companion on Object Oriented Programming Systems Languages and Applications Companion (OOPSLA ’11). ACM, New York, NY, USA, 301–312.

Cited By

View all
  • (2024)Semantics of Remote Direct Memory Access: Operational and Declarative Models of RDMA on TSO ArchitecturesProceedings of the ACM on Programming Languages10.1145/36897818:OOPSLA2(1982-2009)Online publication date: 8-Oct-2024
  • (2024)Extending the C/C++ Memory Model with Inline AssemblyProceedings of the ACM on Programming Languages10.1145/36897498:OOPSLA2(1081-1107)Online publication date: 8-Oct-2024
  • (2024)Toast: A Heterogeneous Memory Management SystemProceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques10.1145/3656019.3676944(53-65)Online publication date: 14-Oct-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 2, Issue POPL
January 2018
1961 pages
EISSN:2475-1421
DOI:10.1145/3177123
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 December 2017
Published in PACMPL Volume 2, Issue POPL

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. Axiomatic
  2. Operational
  3. Relaxed Memory Models
  4. Semantics

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)501
  • Downloads (Last 6 weeks)93
Reflects downloads up to 13 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Semantics of Remote Direct Memory Access: Operational and Declarative Models of RDMA on TSO ArchitecturesProceedings of the ACM on Programming Languages10.1145/36897818:OOPSLA2(1982-2009)Online publication date: 8-Oct-2024
  • (2024)Extending the C/C++ Memory Model with Inline AssemblyProceedings of the ACM on Programming Languages10.1145/36897498:OOPSLA2(1081-1107)Online publication date: 8-Oct-2024
  • (2024)Toast: A Heterogeneous Memory Management SystemProceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques10.1145/3656019.3676944(53-65)Online publication date: 14-Oct-2024
  • (2024)An Axiomatic Basis for Computer Programming on the Relaxed Arm-A Architecture: The AxSL LogicProceedings of the ACM on Programming Languages10.1145/36328638:POPL(604-637)Online publication date: 5-Jan-2024
  • (2024)Analyzing the memory ordering models of the Apple M1Journal of Systems Architecture: the EUROMICRO Journal10.1016/j.sysarc.2024.103102149:COnline publication date: 1-Apr-2024
  • (2024)A Denotational Approach to Release/Acquire ConcurrencyProgramming Languages and Systems10.1007/978-3-031-57267-8_5(121-149)Online publication date: 6-Apr-2024
  • (2023)Bringing Compiling Databases to RISC ArchitecturesProceedings of the VLDB Endowment10.14778/3583140.358314216:6(1222-1234)Online publication date: 1-Feb-2023
  • (2023)Static Analysis of Memory Models for SMT EncodingsProceedings of the ACM on Programming Languages10.1145/36228557:OOPSLA2(1618-1647)Online publication date: 16-Oct-2023
  • (2023)Transient-Execution Attacks: A Computer Architect PerspectiveACM Computing Surveys10.1145/360361956:3(1-38)Online publication date: 6-Oct-2023
  • (2023)Putting Weak Memory in Order via a Promising Intermediate RepresentationProceedings of the ACM on Programming Languages10.1145/35912977:PLDI(1872-1895)Online publication date: 6-Jun-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media