DOI: 10.1145/3342195.3387523
research-article

EvenDB: optimizing key-value storage for spatial locality

Published: 17 April 2020

Abstract

Applications of key-value (KV-)storage often exhibit high spatial locality, such as when many data items have identical composite key prefixes. This prevalent access pattern is underused by the ubiquitous LSM design underlying high-throughput KV-stores today.
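
To make the access pattern concrete, here is a minimal Python sketch of how composite keys that share a prefix become neighbors in sorted key order, so that all items under one prefix occupy a single contiguous key range. The key layout (`app_id|timestamp`), the helpers `make_key` and `prefix_scan`, and the sample values are hypothetical illustrations, not taken from EvenDB.

```python
import bisect

def make_key(app_id: str, timestamp: int) -> bytes:
    # Hypothetical composite key: prefix (app_id) + zero-padded timestamp,
    # so that byte order matches (app_id, timestamp) order.
    return f"{app_id}|{timestamp:010d}".encode()

def prefix_scan(sorted_keys: list, prefix: bytes) -> list:
    # All keys starting with `prefix` form one contiguous slice of the
    # sorted key list. Assumes ASCII keys, so b"\xff" sorts after any suffix.
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix + b"\xff")
    return sorted_keys[lo:hi]

# Made-up sample data: three events for app_a, one each for app_b and app_c.
keys = sorted(
    make_key(app, ts)
    for app, ts in [("app_b", 7), ("app_a", 3), ("app_a", 1), ("app_c", 2), ("app_a", 2)]
)
app_a_keys = prefix_scan(keys, b"app_a|")
```

In this toy setting, `app_a_keys` holds the three `app_a` keys in timestamp order; a store that keeps such a range physically together can serve them with few, mostly sequential reads.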
We present EvenDB, a general-purpose persistent KV-store optimized for spatially-local workloads. EvenDB combines spatial data partitioning with LSM-like batch I/O. It achieves high throughput, ensures consistency under multi-threaded access, and reduces write amplification.
In experiments with real-world data from a large analytics platform, EvenDB outperforms the state-of-the-art. E.g., on a 256GB production dataset, EvenDB ingests data 4.4X faster than RocksDB and reduces write amplification by nearly 4X. In traditional YCSB workloads lacking spatial locality, EvenDB is on par with RocksDB and significantly better than other open-source solutions we explored.
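
Write amplification is commonly defined as the ratio of bytes physically written to storage to bytes logically written by the application. The Python sketch below shows how a reduction factor of the kind quoted above would be computed. The function name and the byte counts are hypothetical and chosen only for easy arithmetic; they are not measurements from the paper.

```python
def write_amplification(storage_bytes_written: int, app_bytes_written: int) -> float:
    """Bytes physically written to storage per byte written by the application."""
    if app_bytes_written <= 0:
        raise ValueError("app_bytes_written must be positive")
    return storage_bytes_written / app_bytes_written

# Hypothetical byte counts, not results from the paper.
APP_BYTES = 100
baseline_wa = write_amplification(2000, APP_BYTES)  # store that rewrites data often
improved_wa = write_amplification(500, APP_BYTES)   # store that rewrites data less
reduction_factor = baseline_wa / improved_wa
```

With these made-up inputs the baseline writes 20 bytes to storage per application byte, the improved store writes 5, and the reduction factor is 4.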

Published In

EuroSys '20: Proceedings of the Fifteenth European Conference on Computer Systems
April 2020
49 pages
ISBN:9781450368827
DOI:10.1145/3342195

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Funding Sources

  • Hasso Plattner Institute

Conference

EuroSys '20
EuroSys '20: Fifteenth EuroSys Conference 2020
April 27 - 30, 2020
Heraklion, Greece

Acceptance Rates

EuroSys '20 Paper Acceptance Rate 43 of 234 submissions, 18%;
Overall Acceptance Rate 241 of 1,308 submissions, 18%

Article Metrics

  • Downloads (Last 12 months): 126
  • Downloads (Last 6 weeks): 16
Reflects downloads up to 15 Jan 2025

Cited By

  • (2024) SolsDB. Future Generation Computer Systems 160:C (295-304). DOI: 10.1016/j.future.2024.05.050. Online publication date: 1-Nov-2024
  • (2024) EKRM: Efficient Key-Value Retrieval Method to Reduce Data Lookup Overhead for Redis. Euro-Par 2024: Parallel Processing (166-179). DOI: 10.1007/978-3-031-69577-3_12. Online publication date: 26-Aug-2024
  • (2023) Asgard: Are NoSQL databases suitable for ephemeral data in serverless workloads? Frontiers in High Performance Computing 1. DOI: 10.3389/fhpcp.2023.1127883. Online publication date: 4-Sep-2023
  • (2023) Efficient Compactions between Storage Tiers with PrismDB. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (179-193). DOI: 10.1145/3582016.3582052. Online publication date: 25-Mar-2023
  • (2023) FlatLSM: Write-Optimized LSM-Tree for PM-Based KV Stores. ACM Transactions on Storage 19:2 (1-26). DOI: 10.1145/3579855. Online publication date: 6-Mar-2023
  • (2023) The Locality of Memory Checking. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (1820-1834). DOI: 10.1145/3576915.3623195. Online publication date: 15-Nov-2023
  • (2023) The Design and Implementation of UniKV for Mixed Key-Value Storage Workloads. IEEE Transactions on Knowledge and Data Engineering 35:11 (11935-11949). DOI: 10.1109/TKDE.2023.3234510. Online publication date: 1-Nov-2023
  • (2023) PMDB: A Range-Based Key-Value Store on Hybrid NVM-Storage Systems. IEEE Transactions on Computers 72:5 (1274-1285). DOI: 10.1109/TC.2022.3202755. Online publication date: 1-May-2023
  • (2022) Improving Concurrent GC for Latency Critical Services in Multi-tenant Systems. Proceedings of the 23rd ACM/IFIP International Middleware Conference (43-55). DOI: 10.1145/3528535.3531515. Online publication date: 7-Nov-2022
  • (2022) Holmes. Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing (110-121). DOI: 10.1145/3502181.3531464. Online publication date: 27-Jun-2022
