[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

PA-LBF: : Prefix-Based and Adaptive Learned Bloom Filter for Spatial Data

Published: 01 January 2023 Publication History

Abstract

The recently proposed learned bloom filter (LBF) opens a new perspective on how to reconstruct bloom filters with machine learning. However, the LBF has a massive time cost and does not apply to multidimensional spatial data. In this paper, we propose a prefix-based and adaptive learned bloom filter (PA-LBF) for spatial data, which efficiently supports the insertion and deletion. The proposed PA-LBF is divided into three parts: (1) the prefix-based classification. The Z-order space-filling curve is used to extract data, prefix it, and classify it. (2) The adaptive learning process. The multiple independent adaptive sub-LBFs are designed to train the suffixes of data, combined with part 1, to reduce the false positive rate (FPR), query, and learning process time consumption. (3) The backup filter uses CBF. Two kinds of backup CBF are constructed to meet the situation of different insertion and deletion frequencies. Experimental results prove the validity of the theory and show that the PA-LBF reduces the FPR by 84.87%, 79.53%, and 43.01% with the same memory usage compared with the LBF on three real-world spatial datasets. Moreover, the time consumption of PA-LBF can be reduced to and 2.05× that of the LBF on the query and learning process, respectively.

References

[1]
B. H. Bloom, “Space/time trade-offs in hash coding with allowable errors,” Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970.
[2]
J. M. Medina, I. J. Blanco, and O. Pons, “A fuzzy database engine for mongoDB,” International Journal of Intelligent Systems, vol. 37, no. 9, pp. 5691–5724, 2022.
[3]
J. H. Mun and H. Lim, “New approach for efficient ip address lookup using a bloom filter in trie-based algorithms,” IEEE Transactions on Computers, vol. 65, no. 5, pp. 1558–1565, 2016.
[4]
Z. Dai and A. Shrivastava, “Adaptive learned bloom filter (ada-bf): efficient utilization of the classifier with application to real-time information filtering on the web,” Advances in Neural Information Processing Systems, vol. 33, pp. 11700–11710, 2020.
[5]
R. Patgiri, A. Biswas, and S. Nayak, “deepbf: malicious url detection using learned bloom filter and evolutionary deep learning,” 2021, https://arxiv.org/abs/2103.12544arXiv preprint arXiv:2103.12544.
[6]
P. Reviriego, J. A. Hernández, Z. Dai, and A. Shrivastava, “Learned bloom filters in adversarial environments: a malicious URL detection use-case,” in Proceedings of the IEEE. 2021 IEEE 22nd International Conference on High Performance Switching and Routing (HPSR), pp. 1–6, Paris, France, June 2021.
[7]
J. Lee, H. Byun, and H. Lim, “Dual-load Bloom filter: application for name lookup,” Computer Communications, vol. 151, pp. 1–9, 2020.
[8]
Q. Wu, Q. Wang, M. Zhang, R. Zheng, J. Zhu, and J. Hu, “Learned bloom-filter for the efficient name lookup in information-centric networking,” Journal of Network and Computer Applications, vol. 186, p. 103077, 2021.
[9]
Y. Wu, J. He, S. Yan, J. Wu, T. Yang, O. Ruas, G. Zhang, and B. Cui, “Elastic bloom filter: deletable and expandable filter using elastic fingerprints,” IEEE Transactions on Computers, vol. 71, no. 4, pp. 984–991, 2022.
[10]
R. Xie, M. Li, Z. Miao et al., “Hash adaptive bloom filter,” in Proceedings of the IEEE. 2021 IEEE 37th International Conference on Data Engineering (ICDE), pp. 636–647, Chania, Greece, April 2021.
[11]
L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Summary cache: a scalable wide-area web cache sharing protocol,” IEEE/ACM Transactions on Networking, vol. 8, no. 3, pp. 281–293, 2000.
[12]
S. Nayak and R. Patgiri, “countBF: a general-purpose high accuracy and space efficient counting bloom filter,” in Proceedings of the IEEE. 2021 17th International Conference on Network and Service Management (CNSM), pp. 355–359, Rio de Janeiro, Brazil, November 2021.
[13]
B. Fan, D. G. Andersen, M. Kaminsky, and M. D. Mitzenmacher, “Cuckoo filter: practically better than bloom,” in Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies, pp. 75–88, Sydney Australia, December 2014.
[14]
K. Xie, Y. Min, D. Zhang, J. Wen, and G. Xie, “A scalable bloom filter for membership queries,” in Proceedings of the IEEE. IEEE GLOBECOM 2007-IEEE Global Telecommunications Conference, pp. 543–547, Washington, DC, USA, November 2007.
[15]
D. Guo, J. Wu, H. Chen, and X. Luo, “Theory and network applications of dynamic bloom filters,” in Proceedings of the IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications, pp. 1–12, Barcelona, Spain, April 2006.
[16]
D. Guo, J. Wu, H. Chen, Y. Yuan, and X. Luo, “The dynamic bloom filters,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 1, pp. 120–133, 2010.
[17]
A. Crainiceanu and D. Lemire, “Bloofi: multidimensional bloom filters,” Information Systems, vol. 54, pp. 311–324, 2015.
[18]
R. Patgiri, S. Nayak, and S. K. Borgohain, “rDBF: a r-dimensional bloom filter for massive scale membership query,” Journal of Network and Computer Applications, vol. 136, pp. 100–113, 2019.
[19]
Y. Fu, R. Du, H. Hu, M. H. Au, and D. Li, “Matrix bloom filter: an efficient probabilistic data structure for 2-tuple batch lookup,” 2019, https://arxiv.org/abs/1912.07153arXiv preprint arXiv:1912.07153.
[20]
T. Kraska, A. Beutel, E. H. Chi, J. Dean, and N. Polyzotis, “The case for learned index structures,” in Proceedings of the 2018 international conference on management of data, pp. 489–504, Houston TX USA, June 2018.
[21]
M. Mitzenmacher, “A model for learned bloom filters and optimizing by sandwiching,” Advances in Neural Information Processing Systems, vol. 31, 2018.
[22]
J. Rae, S. Bartunov, and T. Lillicrap, “Meta-learning neural bloom filters,” in Proceedings of the PMLR. International Conference on Machine Learning, pp. 5271–5280, Long Beach, CA, USA, June 2019.
[23]
K. Vaidya, E. Knorr, M. Mitzenmacher, and T. Kraska, “Partitioned learned bloom filters,” in Proceedings of the International Conference on Learning Representations, Austria, April 2021.
[24]
A. Bhattacharya, C. Gudesa, A. Bagchi, and S. Bedathur, “New wine in an old bottle: data-aware hash functions for bloom filters,” Proceedings of the VLDB Endowment, vol. 15, no. 9, pp. 1924–1936, 2022.
[25]
S. Macke, A. Beutel, T. Kraska, M. Sathiamoorthy, D. Z. Cheng, and E. H. Chi, “Lifting the curse of multidimensional data with learned existence indexes,” Workshop on ML for Systems at NeurIPS, vol. 1–6, 2018.
[26]
A. Davitkova, D. Gjurovski, and S. Michel, “Compressing (multidimensional) learned bloom filters,” Workshop on Databases and AI, 2022, https://arxiv.org/abs/2208.03029arXiv preprint arXiv:1912.07153.
[27]
Q. Liu, L. Zheng, Y. Shen, and L. Chen, “Stable learned bloom filters for data streams,” Proceedings of the VLDB Endowment, vol. 13, no. 12, pp. 2355–2367, 2020.
[28]
A. Bhattacharya, S. Bedathur, and A. Bagchi, “Adaptive learned bloom filters under incremental workloads,” in Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, pp. 107–115, Hyderabad India, January 2020.
[29]
H. Byun and H. Lim, “Learned FBF: learning-based functional bloom filter for key-value storage,” IEEE Transactions on Computers, vol. 71, p. 1, 2021.
[30]
G. Ke, Q. Meng, T. Finley et al., “Lightgbm: a highly efficient gradient boosting decision tree,” Advances in Neural Information Processing Systems, vol. 30, 2017.
[31]
S. Li, W. Li, C. Cook, C. Zhu, and Y. Gao, “Independently recurrent neural network (indrnn): building a longer and deeper rnn,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 5457–5466, San Juan, PR, USA, June 2018.
[32]
Y. Ding, Z. Ma, S. Wen, J. Xie, D. Chang, Z. Si, M. Wu, and H Ling, “AP-CNN: weakly supervised attention pyramid convolutional neural network for fine-grained visual classification,” IEEE Transactions on Image Processing, vol. 30, pp. 2826–2836, 2021.
[33]
H. Sagan, Space-filling Curves, Springer Science & Business Media, Berlin, Germany, 2012.
[34]
F. Rams ak, V. Markl, R. Fenk, M. Zirkel, K. Elhardt, and R. Bayer, “Integrating the UB-tree into a database system kernel,” in Proceedings of the 2000. Citeseer. VLDB, pp. 263–272, Cairo, Egypt, September 2000.
[35]
B. Zou, M. Zeng, C. Zhu, L. Xiao, and Z. Chen, “A learned prefix bloom filter for spatial data,” in Proceedings of the Database and Expert Systems Applications, pp. 336–350, Springer International Publishing, Vienna, Austria, August 2022.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image International Journal of Intelligent Systems
International Journal of Intelligent Systems  Volume 2023, Issue
2023
3189 pages
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Publisher

John Wiley and Sons Ltd.

United Kingdom

Publication History

Published: 01 January 2023

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 0
    Total Downloads
  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 18 Jan 2025

Other Metrics

Citations

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media