[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

S3: a scalable in-memory skip-list index for key-value store

Published: 01 August 2019 Publication History

Abstract

Many new memory indexing structures have been proposed and outperform current in-memory skip-list index adopted by LevelDB, RocksDB and other key-value systems. However, those new indexes cannot be easily intergrated with key-value systems, because most of them do not consider how the data can be efficiently flushed to disk. Some assumptions, such as fixed size key and value, are unrealistic for real applications. In this paper, we present S3, a scalable in-memory skip-list index for the customized version of RocksDB in Alibaba Cloud. S3 adopts a two-layer structure. In the top layer, a cache-sensitive structure is used to maintain a few guard entries to facilitate the search over the skip-list. In the bottom layer, a semi-ordered skip-list index is built to support highly concurrent insertions and fast lookup and range query. To further improve the performance, we train a neural model to select guard entries intelligently according to the data distribution and query distribution. Experiments on multiple datasets show that S3 achieves a comparable performance to other new memory indexing schemes, and can replace current in-memory skip-list of LevelDB and RocksDB to support huge volume of data.

References

[1]
Apache hbase. http://hbase.apache.org/.
[2]
Bw-tree. https://github.com/wangziqi2013/bwtree.
[3]
Cicada. https://github.com/efficient/cicada-engine.
[4]
Clht. https://github.com/lpd-epfl/clht.
[5]
Cockroachdb. https://github.com/cockroachdb/cockroach.
[6]
Hyperleveldb. https://github.com/rescrv/hyperleveldb.
[7]
Judy arrays. http://judy.sourceforge.net/.
[8]
Leveldb. http://ccrma.stanford.edu/jos/bayes/bayes.html.
[9]
Masstree. https://github.com/kohler/masstree-beta.
[10]
Mongodb. https://www.mongodb.com.
[11]
Peloton. https://github.com/cmu-db/peloton.
[12]
Redis. https://redis.io/.
[13]
Rocksdb. http://rocksdb.org.
[14]
Search results apache flink: Scalable stream and batch data processing. https://flink.apache.org.
[15]
Yahoo! cloud serving benchmark in c++. https://github.com/basicthinker/ycsb-c.
[16]
V. Alvarez, S. Richter, X. Chen, and J. Dittrich. A comparison of adaptive radix trees and hash tables. In ICDE, pages 1227--1238, 2015.
[17]
O. Balmau, R. Guerraoui, V. Trigonakis, and I. Zablotchi. Flodb: Unlocking memory in persistent key-value stores. In EuroSys, pages 80--94, 2017.
[18]
S. Chen, P. B. Gibbons, and T. C. Mowry. Improving index performance through prefetching. In SIGMOD, pages 235--246, 2001.
[19]
B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking cloud serving systems with YCSB. In SoCC, pages 143--154, 2010.
[20]
T. David, R. Guerraoui, and V. Trigonakis. Asynchronized concurrency: The secret to scaling concurrent search data structures. In ASPLOS, pages 631--644, 2015.
[21]
G. Golan-Gueta, E. Bortnikov, E. Hillel, and I. Keidar. Scaling concurrent log-structured data stores. In EuroSys, pages 32:1--32:14, 2015.
[22]
R. A. Hankins and J. M. Patel. Effect of node size on the performance of cache-conscious b<sup>+</sup>-trees. In SIGMETRICS, pages 283--294, 2003.
[23]
S. Hochreiter. The vanishing gradient problem during learning recurrent neural nets and problem solutions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 6(2):107--116, 1998.
[24]
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735--1780, 1997.
[25]
C. Kim, J. Chhugani, N. Satish, E. Sedlar, A. D. Nguyen, T. Kaldewey, V. W. Lee, S. A. Brandt, and P. Dubey. FAST: fast architecture sensitive tree search on modern cpus and gpus. In SIGMOD, pages 339--350, 2010.
[26]
V. Leis, A. Kemper, and T. Neumann. The adaptive radix tree: Artful indexing for main-memory databases. In ICDE, pages 38--49, 2013.
[27]
J. J. Levandoski, D. B. Lomet, and S. Sengupta. The bw-tree: A b-tree for new hardware platforms. In ICDE, pages 302--313, 2013.
[28]
H. Lim, M. Kaminsky, and D. G. Andersen. Cicada: Dependably fast multi-core in-memory transactions. In SIGMOD, pages 21--35, 2017.
[29]
L. Lu, T. S. Pillai, H. Gopalakrishnan, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. Wisckey: Separating keys from values in ssd-conscious storage. TOS, 13(1):5:1--5:28, 2017.
[30]
Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In EuroSys, pages 183--196, 2012.
[31]
P. E. O'Neil, E. Cheng, D. Gawlick, and E. J. O'Neil. The log-structured merge-tree (lsm-tree). Acta Inf., 33(4):351--385, 1996.
[32]
W. Pugh. Skip lists: A probabilistic alternative to balanced trees. Commun. ACM, 33(6):668--676, 1990.
[33]
P. Raju, R. Kadekodi, V. Chidambaram, and I. Abraham. Pebblesdb: Building key-value stores using fragmented log-structured merge trees. In SOSP, pages 497--514, 2017.
[34]
J. Rao and K. A. Ross. Cache conscious indexing for decision-support in main memory. In VLDB, pages 78--89, 1999.
[35]
J. Rao and K. A. Ross. Making b+-trees cache conscious in main memory. In SIGMOD, pages 475--486, 2000.
[36]
J. Sewall, J. Chhugani, C. Kim, N. Satish, and P. Dubey. PALM: parallel architecture-friendly latch-free modifications to B+ trees on many-core processors. PVLDB, 4(11):795--806, 2011.
[37]
S. Sprenger, S. Zeuch, and U. Leser. Cache-sensitive skip list: Efficient range queries on modern cpus. In IMDM, pages 1--17, 2016.
[38]
I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. CoRR, abs/1409.3215, 2014.
[39]
G. Wu, X. He, and B. Eckart. An adaptive write buffer management scheme for flash-based ssds. TOS, 8(1):1:1--1:24, 2012.
[40]
X. Wu, Y. Xu, Z. Shao, and S. Jiang. Lsm-trie: An lsm-tree-based ultra-large key-value store for small data items. In USENIX, pages 71--82, 2015.
[41]
Z. Xie, Q. Cai, G. Chen, R. Mao, and M. Zhang. A comprehensive performance evaluation of modern in-memory indices. In ICDE, pages 641--652, 2018.
[42]
Z. Xie, Q. Cai, H. V. Jagadish, B. C. Ooi, and W. Wong. PI : a parallel in-memory skip list based index. CoRR, abs/1601.00159, 2016.
[43]
Z. Xie, Q. Cai, H. V. Jagadish, B. C. Ooi, and W. Wong. Parallelizing skip lists for in-memory multi-core database systems. In ICDE, pages 119--122, 2017.

Cited By

View all
  • (2024)Demonstration of Ver: View Discovery in the WildCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3654748(428-431)Online publication date: 9-Jun-2024
  • (2024)A survey on hybrid transactional and analytical processingThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-024-00858-933:5(1485-1515)Online publication date: 1-Sep-2024
  • (2023)BP-Tree: Overcoming the Point-Range Operation Tradeoff for In-Memory B-TreesProceedings of the VLDB Endowment10.14778/3611479.361150216:11(2976-2989)Online publication date: 24-Aug-2023
  • Show More Cited By
  1. S3: a scalable in-memory skip-list index for key-value store

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 12, Issue 12
    August 2019
    547 pages

    Publisher

    VLDB Endowment

    Publication History

    Published: 01 August 2019
    Published in PVLDB Volume 12, Issue 12

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)130
    • Downloads (Last 6 weeks)6
    Reflects downloads up to 12 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Demonstration of Ver: View Discovery in the WildCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3654748(428-431)Online publication date: 9-Jun-2024
    • (2024)A survey on hybrid transactional and analytical processingThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-024-00858-933:5(1485-1515)Online publication date: 1-Sep-2024
    • (2023)BP-Tree: Overcoming the Point-Range Operation Tradeoff for In-Memory B-TreesProceedings of the VLDB Endowment10.14778/3611479.361150216:11(2976-2989)Online publication date: 24-Aug-2023
    • (2022)A design space for RDF data representationsThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-021-00725-x31:2(347-373)Online publication date: 21-Jan-2022
    • (2021)Opportunities and Challenges in Code Search ToolsACM Computing Surveys10.1145/348002754:9(1-40)Online publication date: 8-Oct-2021
    • (2020)Solving the Fragment Complexity of Official, Social, and Sensorial Urban DataComplexity10.1155/2020/89147572020Online publication date: 1-Jan-2020
    • (2020)JellyFishProceedings of the 21st International Middleware Conference10.1145/3423211.3425672(134-148)Online publication date: 7-Dec-2020

    View Options

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media