
research-article

Delegated persist ordering

Authors:

Aasheesh Kolli,

Stephan Diestelhorst,

Thomas F. Wenisch

MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture

Article No.: 58, Pages 1 - 13

Published: 15 October 2016

Abstract

Systems featuring a load-store interface to persistent memory (PM) are expected soon, making in-memory persistent data structures feasible. Ensuring that a persistent data structure is recoverable requires constraints on the order in which PM writes become persistent. However, current memory systems reorder writes and provide no such guarantees. To complement its upcoming 3D XPoint memory, Intel has announced new instructions that give programmers control over data persistence. We describe the semantics implied by these instructions, an ordering model we call synchronous ordering.
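As an illustration of why persist order matters for recoverability, the following minimal sketch enumerates every PM image that a crash could expose for a two-write update. The addresses `data` and `valid`, the value 42, and the helper names are hypothetical and are not taken from the paper; the sketch assumes a recovery invariant in which a persisted valid flag implies a persisted payload.

```python
from itertools import permutations

def recoverable(pm):
    # Assumed invariant: if the valid flag persisted, the payload must have too.
    return not (pm.get("valid") == 1 and pm.get("data") != 42)

def crash_states(writes, ordered):
    """Yield every PM image a crash could expose.

    writes:  list of (address, value) pairs in program order.
    ordered: if True, persists follow program order; otherwise the
             memory system may persist them in any order.
    """
    orders = [tuple(writes)] if ordered else permutations(writes)
    for order in orders:
        # A crash may occur after any prefix of the persist order.
        for k in range(len(order) + 1):
            yield dict(order[:k])

writes = [("data", 42), ("valid", 1)]
unordered_ok = all(recoverable(pm) for pm in crash_states(writes, ordered=False))
ordered_ok = all(recoverable(pm) for pm in crash_states(writes, ordered=True))
print(unordered_ok, ordered_ok)  # False True
```

With reordering allowed, one crash state has the flag persisted without the payload, so recovery would read garbage. Constraining the persist order removes that state.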

Synchronous ordering (SO) enforces order by stalling execution when PM write ordering is required, exposing PM write latency on the execution critical path. It incurs an average slowdown of 7.21x over volatile execution without ordering in PM-write-intensive benchmarks. SO tightly couples enforcing order and flushing writes to PM, but this tight coupling is unneeded in many recoverable software systems. Instead, we propose delegated ordering, wherein ordering requirements are communicated explicitly to the PM controller, fully decoupling PM write ordering from volatile execution and cache management. We demonstrate that delegated ordering can bring performance within 1.93x of volatile execution, improving over SO by 3.73x.
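The contrast between the two models can be sketched with a toy cost model. This is not the paper's simulator or mechanism: the operation names, the one-cycle cost per operation, the 100-cycle PM write latency, and the unbounded controller buffering are all assumptions made for illustration, and the resulting numbers are unrelated to the 7.21x and 1.93x figures above.

```python
def synchronous_cycles(ops, pm_write_latency):
    """Toy model of synchronous ordering: a barrier stalls the core
    until the PM writes issued before it have become persistent."""
    cycles = 0
    pending = 0
    for op in ops:
        cycles += 1  # assume one cycle to issue any operation
        if op == "write":
            pending += 1
        elif op == "barrier" and pending:
            # Assume pending writes drain in parallel, costing one PM latency.
            cycles += pm_write_latency
            pending = 0
    return cycles

def delegated(ops):
    """Toy model of delegated ordering: the core only records ordering
    points; a hypothetical PM controller later persists each epoch of
    writes in order, off the execution critical path."""
    cycles = 0
    epochs = [[]]
    for i, op in enumerate(ops):
        cycles += 1  # the barrier costs only its issue cycle here
        if op == "write":
            epochs[-1].append(i)
        elif op == "barrier" and epochs[-1]:
            epochs.append([])  # start a new epoch for later writes
    return cycles, [e for e in epochs if e]

ops = ["write", "write", "barrier", "write", "barrier", "compute"]
print(synchronous_cycles(ops, 100))  # 206
print(delegated(ops))                # (6, [[0, 1], [3]])
```

In the synchronous model the PM write latency appears in the core's cycle count at every barrier. In the delegated model the same ordering information survives as the epoch list, which the controller is assumed to honor without stalling the core.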

Published In

MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture

October 2016

816 pages

General Chairs: Wei-Chung Hsu (NTU, Taiwan), Chia-Lin Yang (NTU, Taiwan)

Program Chairs: Mikko Lipasti (Univ. Wisconsin), Hsien-Hsin Lee (TSMC, Taiwan)

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
IEEE-CS\DATC: IEEE Computer Society

Publisher

IEEE Press

Conference

MICRO-49

October 15 - 19, 2016

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%
