DOI: 10.1145/3458817.3480852
research-article

LogECMem: coupling erasure-coded in-memory key-value stores with parity logging

Published: 13 November 2021

Abstract

In-memory key-value stores are often used to speed up Big Data workloads on modern HPC clusters. To keep them highly available, erasure coding has recently been adopted as a low-cost redundancy scheme in place of replication. Existing erasure-coded update schemes, however, suffer from either low performance or high memory overhead. In this paper, we propose a novel parity logging-based architecture, HybridPL, which combines in-place updates (for data and XOR parity chunks) with log-based updates (for the remaining parity chunks), so as to balance update performance against memory cost while maintaining efficient single-failure repairs. We realize HybridPL as an in-memory key-value store called LogECMem, and further design efficient repair schemes for multiple failures. We prototype LogECMem and conduct experiments on different workloads. We show that LogECMem achieves better update performance than existing erasure-coded update schemes with low memory overhead, while maintaining high basic I/O and repair performance.
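
The hybrid update path described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the `Stripe` class, the single logged parity `q_parity`, the coefficients 2^i, and the field polynomial 0x11d are all assumptions chosen for illustration. The sketch updates the data chunk and the XOR parity in place, and only appends a delta to a log for the second parity.

```python
from dataclasses import dataclass, field


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8); polynomial 0x11d is an assumed choice."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return r


def coef(i: int) -> int:
    """Hypothetical coefficient of data chunk i in the second parity: 2^i."""
    c = 1
    for _ in range(i):
        c = gf_mul(c, 2)
    return c


def scale(chunk: bytes, c: int) -> bytes:
    return bytes(gf_mul(x, c) for x in chunk)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


@dataclass
class Stripe:
    data: list                                  # k equally sized data chunks
    xor_parity: bytes = b""                     # updated in place
    q_parity: bytes = b""                       # second parity, updated lazily
    q_log: list = field(default_factory=list)   # pending deltas for q_parity

    def __post_init__(self):
        size = len(self.data[0])
        self.xor_parity = bytes(size)
        self.q_parity = bytes(size)
        for i, d in enumerate(self.data):
            self.xor_parity = xor(self.xor_parity, d)
            self.q_parity = xor(self.q_parity, scale(d, coef(i)))

    def update(self, i: int, new: bytes) -> None:
        delta = xor(self.data[i], new)
        self.data[i] = new                              # in-place data update
        self.xor_parity = xor(self.xor_parity, delta)   # in-place XOR parity
        self.q_log.append(scale(delta, coef(i)))        # log-based update

    def merged_q(self) -> bytes:
        """Apply all logged deltas to obtain the up-to-date second parity."""
        q = self.q_parity
        for d in self.q_log:
            q = xor(q, d)
        return q

    def repair_single(self, lost: int) -> bytes:
        """Rebuild one data chunk from the XOR parity alone; no log is read."""
        out = self.xor_parity
        for i, d in enumerate(self.data):
            if i != lost:
                out = xor(out, d)
        return out


stripe = Stripe([b"aaaa", b"bbbb", b"cccc"])
stripe.update(1, b"zzzz")
fresh = Stripe([b"aaaa", b"zzzz", b"cccc"])
```

In this sketch a single-chunk repair touches only the in-place XOR parity, which is one way to read the abstract's claim about efficient single-failure repairs; the logged parity is needed only when `merged_q()` is called, for example to handle multiple failures.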

Supplementary Material

MP4 File (LogECMem Coupling Erasure-Coded In-Memory Key-Value Stores with Parity Logging 232 Afternoon 2.mp4.mp4)
Presentation video

Published In

SC '21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2021
1493 pages
ISBN: 9781450384421
DOI: 10.1145/3458817
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. erasure coding
  2. key-value stores
  3. parity logging
  4. update

Qualifiers

  • Research-article

Conference

SC '21

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Article Metrics

  • Downloads (Last 12 months): 96
  • Downloads (Last 6 weeks): 11
Reflects downloads up to 10 Dec 2024

Cited By

  • (2024) CoRD: Combining Raid and Delta for Fast Partial Updates in Erasure-Coded Storage Clusters. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pages 1-14. DOI: 10.1109/SC41406.2024.00113. Online publication date: 17-Nov-2024
  • (2024) Advanced Elastic Reed–Solomon Codes for Erasure-Coded Key–Value Stores. IEEE Internet of Things Journal, 11(3), pages 4747-4762. DOI: 10.1109/JIOT.2023.3299574. Online publication date: 1-Feb-2024
  • (2023) Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-13. DOI: 10.1145/3581784.3607072. Online publication date: 12-Nov-2023
  • (2023) Towards Survivable In-Memory Stores with Parity Coded NVRAM. 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pages 956-963. DOI: 10.1109/TrustCom60117.2023.00135. Online publication date: 1-Nov-2023
  • (2023) MDTUpdate: A Multi-Block Double Tree Update Technique in Heterogeneous Erasure-Coded Clusters. IEEE Transactions on Computers, 72(10), pages 2808-2821. DOI: 10.1109/TC.2023.3271064. Online publication date: 1-Oct-2023
  • (2023) Boosting Multi-Block Repair in Cloud Storage Systems with Wide-Stripe Erasure Coding. 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 279-289. DOI: 10.1109/IPDPS54959.2023.00036. Online publication date: May-2023
