research-article
Open access

Reducing DRAM footprint with NVM in Facebook

Published: 23 April 2018

Abstract

Popular SSD-based key-value stores consume a large amount of DRAM in order to provide high-performance database operations. However, DRAM can be expensive for data center providers, especially given recent global supply shortages that have driven up DRAM costs. In this work, we design a key-value store, MyNVM, which leverages an NVM block device to reduce both DRAM usage and total cost of ownership, while providing latency and queries-per-second (QPS) comparable to MyRocks on a server with a much larger amount of DRAM. Replacing DRAM with NVM introduces several challenges. In particular, NVM has limited read bandwidth, and it wears out quickly under a high write bandwidth.
We design novel solutions to these challenges, including using small block sizes with a partitioned index, aligning blocks post-compression to reduce read bandwidth, utilizing dictionary compression, implementing an admission control policy that governs which objects get cached in NVM in order to control its durability, and replacing interrupts with a hybrid polling mechanism. We implemented MyNVM and measured its performance in Facebook's production environment. Our implementation reduces the size of the DRAM cache from 96 GB to 16 GB, and incurs a negligible impact on latency and queries-per-second compared to MyRocks. Finally, to the best of our knowledge, this is the first study on the usage of NVM devices in a commercial data center environment.
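
To illustrate why aligning blocks after compression can reduce read bandwidth, the following is a minimal sketch and not MyNVM's implementation. It assumes a 4 KB device read unit and made-up compressed block sizes; neither appears in the abstract, and the helper names are hypothetical. When compressed blocks are packed back to back, a block may straddle a page boundary and cost two page reads; starting every block on a page boundary avoids this at the price of padding.

```python
PAGE = 4096  # assumed device read unit in bytes (not stated in the abstract)


def pages_touched(offset: int, size: int, page: int = PAGE) -> int:
    """Number of device pages read to fetch the byte range [offset, offset + size)."""
    first = offset // page
    last = (offset + size - 1) // page
    return last - first + 1


def layout_packed(sizes):
    """Place compressed blocks back to back; return each block's start offset."""
    offsets, pos = [], 0
    for s in sizes:
        offsets.append(pos)
        pos += s
    return offsets


def layout_aligned(sizes, page: int = PAGE):
    """Start each compressed block on a page boundary, padding the gap."""
    offsets, pos = [], 0
    for s in sizes:
        offsets.append(pos)
        pos += -(-s // page) * page  # round the block size up to whole pages
    return offsets


def total_pages_read(sizes, offsets, page: int = PAGE) -> int:
    """Pages read if every block is fetched exactly once."""
    return sum(pages_touched(o, s, page) for o, s in zip(offsets, sizes))


sizes = [3000, 3500, 2800, 4000]  # hypothetical compressed block sizes in bytes
packed = total_pages_read(sizes, layout_packed(sizes))
aligned = total_pages_read(sizes, layout_aligned(sizes))
print(f"packed: {packed} pages, aligned: {aligned} pages")  # packed: 7 pages, aligned: 4 pages
```

In this toy layout the aligned placement reads 4 pages instead of 7, while occupying 16,384 bytes on the device instead of 13,300. The sketch only shows the trade-off between padding and read amplification; the block sizes and alignment rule used in MyNVM are described in the paper itself.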

Published In

EuroSys '18: Proceedings of the Thirteenth EuroSys Conference
April 2018
631 pages
ISBN: 9781450355841
DOI: 10.1145/3190508
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

EuroSys '18
EuroSys '18: Thirteenth EuroSys Conference 2018
April 23 - 26, 2018
Porto, Portugal

Acceptance Rates

EuroSys '18 Paper Acceptance Rate 43 of 262 submissions, 16%;
Overall Acceptance Rate 241 of 1,308 submissions, 18%

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 328
  • Downloads (last 6 weeks): 40
Reflects downloads up to 15 Jan 2025.

Cited By

  • (2024) Structured storage for ubiquitous operating systems. SCIENTIA SINICA Informationis 54:3 (461). DOI: 10.1360/SSI-2022-0415. Online publication date: 12-Mar-2024
  • (2024) Improving Virtualized I/O Performance by Expanding the Polled I/O Path of Linux. Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems (31-37). DOI: 10.1145/3655038.3665944. Online publication date: 8-Jul-2024
  • (2024) An LSM Tree Augmented with B+ Tree on Nonvolatile Memory. ACM Transactions on Storage 20:1 (1-24). DOI: 10.1145/3633475. Online publication date: 30-Jan-2024
  • (2024) Concealing Compression-accelerated I/O for HPC Applications through In Situ Task Scheduling. Proceedings of the Nineteenth European Conference on Computer Systems (981-998). DOI: 10.1145/3627703.3629573. Online publication date: 22-Apr-2024
  • (2024) Enhancing Lossy Compression Through Cross-Field Information for Scientific Applications. SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis (300-308). DOI: 10.1109/SCW63240.2024.00046. Online publication date: 17-Nov-2024
  • (2024) HA-CSD: Host and SSD Coordinated Compression for Capacity and Performance. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (825-838). DOI: 10.1109/IPDPS57955.2024.00078. Online publication date: 27-May-2024
  • (2024) Enabling Efficient NVM-Based Text Analytics without Decompression. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (3725-3738). DOI: 10.1109/ICDE60146.2024.00286. Online publication date: 13-May-2024
  • (2024) Range Cache: An Efficient Cache Component for Accelerating Range Queries on LSM-Based Key-Value Stores. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (488-500). DOI: 10.1109/ICDE60146.2024.00044. Online publication date: 13-May-2024
  • (2024) Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (139-155). DOI: 10.1109/HPCA57654.2024.00021. Online publication date: 2-Mar-2024
  • (2023) Fast application launch on personal computing/communication devices. Proceedings of the 21st USENIX Conference on File and Storage Technologies (425-439). DOI: 10.5555/3585938.3585965. Online publication date: 21-Feb-2023
