research-article

Open access

The scalable commutativity rule: designing scalable software for multicore processors

Authors:

Eddie KohlerAuthors Info & Claims

SOSP '13: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles

Pages 1 - 17

https://doi.org/10.1145/2517349.2522712

Published: 03 November 2013 Publication History

PDF eReader

Abstract

What fundamental opportunities for scalability are latent in interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions this paper introduces the following rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building more scalable software starting from interface design and carrying on through implementation, testing, and evaluation.

To help developers apply the rule, a new tool named Commuter accepts high-level interface models and generates tests of operations that commute and hence could scale. Using these tests, Commuter can evaluate the scalability of an implementation. We apply Commuter to 18 POSIX calls and use the results to guide the implementation of a new research operating system kernel called sv6. Linux scales for 68% of the 13,664 tests generated by Commuter for these calls, and Commuter finds many problems that have been observed to limit application scalability. sv6 scales for 99% of the tests.

Supplementary Material

MP4 File (d1-01-austin-clements.mp4)

Download
1216.58 MB

References

[1]

H. Attiya, E. Hillel, and A. Milani. Inherent limitations on disjoint-access parallel implementations of transactional memory. In Proceedings of the 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures, Calgary, Canada, August 2009.

Digital Library

Google Scholar

[2]

H. Attiya, R. Guerraoui, D. Hendler, P. Kuznetsov, M. M. Michael, and M. Vechev. Laws of order: Expensive synchronization in concurrent algorithms cannot be eliminated. In Proceedings of the 38th ACM Symposium on Principles of Programming Languages, Austin, TX, January 2011.

Digital Library

Google Scholar

[3]

A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, S. Peter, T. Roscoe, A. Schüpbach, and A. Singhania. The Multikernel: A new OS architecture for scalable multicore systems. In Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP), Big Sky, MT, October 2009.

Digital Library

Google Scholar

[4]

F. Bellard et al. QEMU. http://www.qemu.org/.

Google Scholar

[5]

D. J. Bernstein. Some thoughts on security after ten years of qmail 1.0. In Proceedings of the ACM Workshop on Computer Security Architecture, Fairfax, VA, November 2007.

Digital Library

Google Scholar

[6]

P. A. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Computing Surveys, 13(2):185--221, June 1981.

Digital Library

Google Scholar

[7]

S. Boyd-Wickizer. Optimizing Communication Bottlenecks in Multiprocessor Operating System Kernels. PhD thesis, Massachusetts Institute of Technology, February 2014.

Google Scholar

[8]

S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, M. F. Kaashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. Corey: An operating system for many cores. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI), San Diego, CA, December 2008.

Digital Library

Google Scholar

[9]

S. Boyd-Wickizer, A. Clements, Y. Mao, A. Pesterev, M. F. Kaashoek, R. Morris, and N. Zeldovich. An analysis of Linux scalability to many cores. In Proceedings of the 9th Symposium on Operating Systems Design and Implementation (OSDI), Vancouver, Canada, October 2010.

Digital Library

Google Scholar

[10]

C. Cadar, V. Ganesh, P. M. Pawlowski, D. L. Dill, and D. R. Engler. EXE: Automatically generating inputs of death. In Proceedings of the 13th ACM Conference on Computer and Communications Security, 2006.

Digital Library

Google Scholar

[11]

C. Cadar, D. Dunbar, and D. Engler. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI), San Diego, CA, December 2008.

Digital Library

Google Scholar

[12]

B. Cantrill and J. Bonwick. Real-world concurrency. Communications of the ACM, 51(11):34--39, 2008.

Digital Library

Google Scholar

[13]

K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In Proceedings of the 5th ACM SIGPLAN International Conference on Functional Programming, Montreal, Canada, September 2000.

Digital Library

Google Scholar

[14]

A. T. Clements, M. F. Kaashoek, and N. Zeldovich. Concurrent address spaces using RCU balanced trees. In Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), London, UK, March 2012.

Digital Library

Google Scholar

[15]

A. T. Clements, M. F. Kaashoek, and N. Zeldovich. RadixVM: Scalable address spaces for multithreaded applications. In Proceedings of the ACM EuroSys Conference, Prague, Czech Republic, April 2013.

Digital Library

Google Scholar

[16]

J. Corbet. The search for fast, scalable counters, May 2010. http://lwn.net/Articles/170003/.

Google Scholar

[17]

J. Corbet. Dcache scalability and RCU-walk, April 23, 2012. http://lwn.net/Articles/419811/.

Google Scholar

[18]

R. Cox, M. F. Kaashoek, and R. T. Morris. Xv6, a simple Unix-like teaching operating system. http://pdos.csail.mit.edu/6.828/2012/xv6.html.

Google Scholar

[19]

L. de Moura and N. Bjørner. Z3: An efficient SMT solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, Budapest, Hungary, March--April 2008.

Digital Library

Google Scholar

[20]

DWARF Debugging Information Format Committee. DWARF debugging information format, version 4, June 2010.

Google Scholar

[21]

F. Ellen, Y. Lev, V. Luchango, and M. Moir. SNZI: Scalable nonzero indicators. In Proceedings of the 26th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Portland, OR, August 2007.

Digital Library

Google Scholar

[22]

P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, Chicago, IL, June 2005.

Digital Library

Google Scholar

[23]

M. Herlihy and E. Koskinen. Transactional boosting: A methodology for highly-concurrent transactional objects. In Proceedings of the 13th ACM Symposium on Principles and Practice of Parallel Programming, Salt Lake City, UT, February 2008.

Digital Library

Google Scholar

[24]

M. P. Herlihy and J. M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages Systems, 12(3):463--492, 1990.

Digital Library

Google Scholar

[25]

D. Howells. Extended file stat functions, Linux patch. https://lkml.org/lkml/2010/7/14/539.

Google Scholar

[26]

A. Israeli and L. Rappoport. Disjoint-access-parallel implementations of strong shared memory primitives. In Proceedings of the 13th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Los Angeles, CA, August 1994.

Digital Library

Google Scholar

[27]

P. Koopman, A. Alimarine, J. Tretmans, and R. Plasmeijer. Gast: Generic automated software testing. In Proceedings of the 14th International Workshop on the Implementation of Functional Languages, Madrid, Spain, September 2002.

Digital Library

Google Scholar

[28]

C. Lameter. Effective synchronization on Linux/NUMA systems. In Gelato Conference, May 2005. http://www.lameter.com/gelato2005.pdf.

Google Scholar

[29]

P. E. McKenney. Differential profilng. Software: Practice and Experience, 29(3):219--234, 1999.

Digital Library

Google Scholar

[30]

P. E. McKenney. Concurrent code and expensive instructions. https://lwn.net/Articles/423994/, January 2011.

Google Scholar

[31]

P. E. McKenney, D. Sarma, A. Arcangeli, A. Kleen, O. Krieger, and R. Russell. Read-copy update. In Proceedings of the Linux Symposium, Ottawa, Canada, June 2002.

Google Scholar

[32]

J. M. Mellor-Crummey and M. L. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Transactions on Computer Systems, 9(1):21--65, 1991.

Digital Library

Google Scholar

[33]

M. S. Papamarcos and J. H. Patel. A low-overhead coherence solution for multiprocessors with private cache memories. In Proceedings of the 11th Annual International Symposium on Computer Architecture, Ann Arbor, MI, June 1984.

Digital Library

Google Scholar

[34]

P. Prabhu, S. Ghosh, Y. Zhang, N. P. Johnson, and D. I. August. Commutative set: A language extension for implicit parallel programming. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation, San Jose, CA, June 2011.

Digital Library

Google Scholar

[35]

M. C. Rinard and P. C. Diniz. Commutativity analysis: A new analysis technique for parallelizing compilers. ACM Transactions on Programming Languages and Systems, 19(6):942--991, November 1997.

Digital Library

Google Scholar

[36]

A. Roy, S. Hand, and T. Harris. Exploring the limits of disjoint access parallelism. In Proceedings of the 1st USENIX Workshop on Hot Topics in Parallelism, Berkeley, CA, March 2009.

Digital Library

Google Scholar

[37]

K. Sen, D. Marinov, and G. Agha. CUTE: A concolic unit testing engine for C. In Proceedings of the 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Lisbon, Portugal, September 2005.

Digital Library

Google Scholar

[38]

M. Shapiro, N. Preguica, C. Baquero, and M. Zawirski. Conflict-free replicated data types. In Proceedings of the 13th International Conference on Stabilization, Safety, and Security of Distributed Systems, Grenoble, France, October 2011.

Digital Library

Google Scholar

[39]

M. Shapiro, N. Preguica, C. Baquero, and M. Zawirski. Convergent and commutative replicated data types. Bulletin of the EATCS, 104:67--88, June 2011.

Google Scholar

[40]

G. L. Steele, Jr. Making asynchronous parallelism safe for the world. In Proceedings of the 17th ACM Symposium on Principles of Programming Languages, San Francisco, CA, January 1990.

Digital Library

Google Scholar

[41]

G. Tene, B. Iyengar, and M. Wolf. C4: The continuously concurrent compacting collector. SIGPLAN Notices, 46(11):79--88, June 2011.

Digital Library

Google Scholar

[42]

W. E. Weihl. Commutativity-based concurrency control for abstract data types. IEEE Transactions on Computers, 37(12):1488--1505, December 1988.

Digital Library

Google Scholar

[43]

D. Wentzlaff and A. Agarwal. Factored operating systems (fos): The case for a scalable operating system for multicores. ACM SIGOPS Operating System Review, 43(2):76--85, 2009.

Digital Library

Google Scholar

Cited By

View all

Cvetković LCosta FDjokic MFriedman MKlimovic AWitchel EArpaci-Dusseau ARossbach CKeeton K(2024)Dirigent: Lightweight Serverless OrchestrationProceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles10.1145/3694715.3695966(369-384)Online publication date: 4-Nov-2024
https://dl.acm.org/doi/10.1145/3694715.3695966
Haeberlen APhan LMcGuire M(2023)Metaverse as a ServiceProceedings of the 2023 ACM Symposium on Cloud Computing10.1145/3620678.3624662(298-307)Online publication date: 30-Oct-2023
https://dl.acm.org/doi/10.1145/3620678.3624662
Johnson ELaufer EZhao ZGohman DNarayan SSavage SStefan DBrown F(2023)WaVe: a verifiably secure WebAssembly sandboxing runtime2023 IEEE Symposium on Security and Privacy (SP)10.1109/SP46215.2023.10179357(2940-2955)Online publication date: May-2023
https://doi.org/10.1109/SP46215.2023.10179357
Show More Cited By

Index Terms

The scalable commutativity rule: designing scalable software for multicore processors
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Concurrency control

Recommendations

The Scalable Commutativity Rule: Designing Scalable Software for Multicore Processors

What opportunities for multicore scalability are latent in software interfaces, such as system call APIs? Can scalability challenges and opportunities be identified even before any implementation exists, simply by considering interface specifications? ...
The scalable commutativity rule: designing scalable software for multicore processors

Developing software that scales on multicore processors is an inexact science dominated by guesswork, measurement, and expensive cycles of redesign and reimplementation. Current approaches are workload-driven and, hence, can reveal scalability ...
Scalable Rule Lists Learning with Sampling
KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Learning interpretable models has become a major focus of machine learning research, given the increasing prominence of machine learning in socially important decision-making. Among interpretable models, rule lists are among the best-known and easily ...

Reviews

Reviewer: Veronica Lagrange

Connections between commutativity and concurrency are well known and well established in the computing literature. Often, commutativity is used to demonstrate the feasibility of executing instructions concurrently. Clements et al. introduce the idea of using commutativity as an interface design requirement to improve scalability. In their words, “whenever interface operations commute, they can be implemented in a way that scales.” Concurrent operations commute if the aftermath is independent of their execution order. This means that it is not possible to tell a posteriori in which order they were executed. This is achieved by conflict-free implementation, the “holy grail” of scalability. This paper demonstrates that there are some opportunities out there for more scalable interfaces. After presenting some definitions and proofs, the authors describe a tool for automatically developing test cases and testing if a particular implementation commutes. As an example, this tool is applied to a model of various POSIX file system and memory system calls to determine their commutativity. Those are compared and benchmarked against equivalent routines from an experimental operating system implemented according to the COMMUTE rules defined in the paper. Though performance tradeoffs are only briefly discussed, these preliminary results look promising. Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

SOSP '13: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles

November 2013

498 pages

ISBN:9781450323888

DOI:10.1145/2517349

General Chair:
Michael Kaminsky
Intel Labs
,
Program Chair:
Mike Dahlin
Google and UT Austin

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 November 2013

Check for updates

Qualifiers

Research-article

Funding Sources

Conference

SOSP '13

Sponsor:

SIGOPS

SOSP '13: ACM SIGOPS 24th Symposium on Operating Systems Principles

November 3 - 6, 2013

Pennsylvania, Farminton

Acceptance Rates

Overall Acceptance Rate 174 of 961 submissions, 18%

Upcoming Conference

SOSP '25

Sponsor:
sigops

ACM SIGOPS 31st Symposium on Operating Systems Principles

October 13 - 16, 2025

Seoul , Republic of Korea

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

110
Total Citations
View Citations
9,387
Total Downloads

Downloads (Last 12 months)242
Downloads (Last 6 weeks)49

Reflects downloads up to 12 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

View all

Cvetković LCosta FDjokic MFriedman MKlimovic AWitchel EArpaci-Dusseau ARossbach CKeeton K(2024)Dirigent: Lightweight Serverless OrchestrationProceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles10.1145/3694715.3695966(369-384)Online publication date: 4-Nov-2024
https://dl.acm.org/doi/10.1145/3694715.3695966
Haeberlen APhan LMcGuire M(2023)Metaverse as a ServiceProceedings of the 2023 ACM Symposium on Cloud Computing10.1145/3620678.3624662(298-307)Online publication date: 30-Oct-2023
https://dl.acm.org/doi/10.1145/3620678.3624662
Johnson ELaufer EZhao ZGohman DNarayan SSavage SStefan DBrown F(2023)WaVe: a verifiably secure WebAssembly sandboxing runtime2023 IEEE Symposium on Security and Privacy (SP)10.1109/SP46215.2023.10179357(2940-2955)Online publication date: May-2023
https://doi.org/10.1109/SP46215.2023.10179357
Kim CChoi EHan MLee SKim J(2022)Performance Analysis of RCU-Style Non-Blocking Synchronization Mechanisms on a Manycore-Based Operating SystemApplied Sciences10.3390/app1207345812:7(3458)Online publication date: 29-Mar-2022
https://doi.org/10.3390/app12073458
Geng JSivaraman APrabhakar BRosenblum M(2022)NezhaProceedings of the VLDB Endowment10.14778/3574245.357425016:4(629-642)Online publication date: 1-Dec-2022
https://dl.acm.org/doi/10.14778/3574245.3574250
Ganesan AAlagappan RRebello AArpaci-Dusseau AArpaci-Dusseau R(2022)Exploiting Nil-external Interfaces for Fast Replicated StorageACM Transactions on Storage10.1145/354282118:3(1-35)Online publication date: 2-Sep-2022
https://dl.acm.org/doi/10.1145/3542821
Jia XZhang JYu BQian XQi ZGuan H(2022)GiantVM: A Novel Distributed Hypervisor for Resource Aggregation with DSM-aware OptimizationsACM Transactions on Architecture and Code Optimization10.1145/350525119:2(1-27)Online publication date: 7-Mar-2022
https://dl.acm.org/doi/10.1145/3505251
Wang JLi CMa KHuo JYan FFeng XXu Y(2021)AUTOGRProceedings of the VLDB Endowment10.14778/3461535.346154114:9(1517-1530)Online publication date: 22-Oct-2021
https://dl.acm.org/doi/10.14778/3461535.3461541
Liu JRebello ADai YYe CKannan SArpaci-Dusseau AArpaci-Dusseau R(2021)Scale and Performance in a Filesystem Semi-MicrokernelProceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles10.1145/3477132.3483581(819-835)Online publication date: 26-Oct-2021
https://dl.acm.org/doi/10.1145/3477132.3483581
Ganesan AAlagappan RArpaci-Dusseau AArpaci-Dusseau R(2021)Exploiting Nil-Externality for Fast Replicated StorageProceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles10.1145/3477132.3483543(440-456)Online publication date: 26-Oct-2021
https://dl.acm.org/doi/10.1145/3477132.3483543
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Cited By

Index Terms

Recommendations