[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1007/978-3-642-37015-1_52guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

A fast indexing algorithm optimization with user behavior pattern

Published: 28 November 2012 Publication History

Abstract

Internet users' access pattern for objects has been observed to follow Zipf's law. The preference for network resource is showing strong influence on real-time lookup performance in large-scale distributed systems. In order to guarantee search response rate with limited memory space, we develop a new object indexing and locating algorithm called Bloom filter Arrays based on Zipf's-distributed user Preference (ZPBA). The algorithm uses a compact data structure to achieve high accuracy in item lookup. We give the theoretical analysis of ZPBA and then conduct experiments with one million item corpus and 100,000 queries to validate our design. Comparison shows that our solution can be 77% more space efficient than traditional bloom filter based index approaches for applications of concentrated user access preference. The algorithm demonstrates practical application potential in fault tolerant large-scale distributed indexing and item lookup.

References

[1]
Chierichetti, F., Kumar, R., Raghavan, P.: Compressed Web Indexes. In: 18th International Conference on World Wide Web, pp. 451-460. Association for Computing Machinery, New York (2009)
[2]
Sato, I., Nakagawa, H.: Topic Models with Power-law using Pitman-Yor Process. In: 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 673-682. Association for Computing Machinery, New York (2010)
[3]
Zipf, G. K.: The Psychobiology of Language. Houghton-Mifflin, Boston (1935)
[4]
Breslau, L., Pei, C., Li, F., Phillips, G., Shenker, S.: Web Caching and Zipf-like Distributions: Evidence and Implications. In: IEEE Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 126-134. IEEE Press, New York (1999)
[5]
Rodriguez, P., Spanner, C., Biersack, E. W.: Analysis of Web Caching Architectures: Hierarchical and Distributed Caching. IEEE/ACM Transactions on Networking 9(4), 404-418 (2001)
[6]
Kotera, I., Egawa, R., Takizawa, H., Kobayashi, H.: Modeling of Cache Access Behavior Based on Zipf's Law. In: 9th Workshop on Memory Performance: Dealing with Applications, Systems and Architecture, pp. 9-15. Association for Computing Machinery, New York (2008)
[7]
Jelenkovic, P. R., Kang, X.: Characterizing the Miss Sequence of the LRU Cache. ACM SIGMETRICS Performance Evaluation Review 36(2), 119-121 (2008)
[8]
Zorich, V. A., Cooke, R.: Mathematical Analysis. Springer, Berlin (2004)
[9]
Ozsu, M. T., Valduriez, P.: Principles of Distributed Database Systems, 3rd edn. Springer, Berlin (2011)
[10]
Stoica, I., Morris, R., Karger, D., Kaashoek, M., Balakrishnan, H.: Chord: A Scalable peer to-peer Lookup Service for Internet Applications. In: 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 149-160. Association for Computing Machinery, New York (2001)
[11]
Ratnasamy, S., Francis, P., Handley, M., Karp, R., Shenker, S.: A Scalable Content-Addressable Network. In: 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 161-172. Association for Computing Machinery, New York (2001)
[12]
Weil, S. A., Pollack, K. T., Brandt, S. A., Miller, E. L.: Dynamic Metadata Management for Petabyte-Scale File Systems. In: 2004 ACM/IEEE Conference on Supercomputing, p. 4. IEEE Computer Society, Washington, DC (2004)
[13]
Bloom, B.: Space/time Trade-offs in Hash Coding with Allowable Errors. Communications of the ACM 13(7), 422-426 (1970)
[14]
Tarkoma, S.C., Rothenberg, E., Lagerspetz, E.: Theory and Practice of Bloom Filters for Distributed Systems. IEEE Communications Surveys & Tutorials 14(1), 131-155 (2004)
[15]
Wang, Z., Luo, T.-J.: Intelligent Video Content Routing in a Direct Access Network. In: 3rd Symposium on Web Society, pp. 147-152. IEEE Press, New York (2011)
[16]
Fan, L., Cao, P., Almeida, J., Broder, A. Z.: Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. IEEE/ACM Transactions on Networking 8(3), 281-293 (2000)
[17]
Bonomi, F., Mitzenmacher, M., Panigrahy, R., Singh, S., Varghese, G.: An Improved Construction for Counting Bloom Filters. In: Azar, Y., Erlebach, T. (eds.) ESA 2006. LNCS, vol. 4168, pp. 684-695. Springer, Heidelberg (2006)
[18]
Ficara, D., Giordano, S., Procissi, G., Vitucci, F.: Multilayer Compressed Counting Bloom Filters. In: 27th Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 311-315. IEEE Press, New York (2008)
[19]
Song, H., Dharmapurikar, S., Turner, J., Lockwood, J.: Fast Hash Table Lookup using Extended Bloom Filter: an Aid to Network Processing. In: 2005 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 181-192. Association for Computing Machinery, New York (2005)
[20]
Bonomi, F., Mitzenmacher, M., Panigrah, R., Singh, S., Varghese, G.: Beyond Bloom Filters: from Approximate Membership Checks to Approximate State Machines. In: 2006 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 315-326. Association for Computing Machinery, New York (2006)
[21]
Sung, M., Xu, J., Li, J., Li, L.: Large-scale IP Traceback in Highspeed Internet: Practical Techniques and Information-theoretic Foundation. IEEE/ACM Transaction on Networking 16(6), 1253-1266 (2008)
[22]
Jokela, P., Zahemszky, A., Esteve, C., Arianfar, S., Nikander, P.: LIPSIN: Line Speed Publish/Subscribe Inter-Networking. In: 2009 ACM SIGCOMM 2009 Conference on Data Communication, pp. 195-206. Association for Computing Machinery, New York (2009)
[23]
Zhu, Y., Jiang, H., Wang, J., Xian, F.: HBA: Distributed Metadata Management for Large Cluster-Based Storage Systems. IEEE Transactions on Parallel and Distributed Systems 19(6), 750-763 (2008)
[24]
Wang, Z., Luo, T.: Optimizing Hash Function Number for BF-Based Object Locating Algorithm. In: Tan, Y., Shi, Y., Ji, Z. (eds.) ICSI 2012, Part II. LNCS, vol. 7332, pp. 543-552. Springer, Heidelberg (2012)
[25]
Bruck, J., Gao, J., Jiang, A.: Weighted Bloom Filter. In: 2006 IEEE International Symposium on Information Theory, pp. 2304-2308. IEEE Press, New York (2006)

Cited By

View all
  • (2018)Selection Optimization of Bloom Filter-Based Index Services in Ubiquitous Embedded SystemsWeb Services – ICWS 201810.1007/978-3-319-94289-6_15(231-245)Online publication date: 25-Jun-2018
  • (2015)Fast approximate hash table using extended counting Bloom filterInternational Journal of Computational Science and Engineering10.1504/IJCSE.2015.07349711:4(380-390)Online publication date: 1-Dec-2015

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ICPCA/SWS'12: Proceedings of the 2012 international conference on Pervasive Computing and the Networked World
November 2012
917 pages
ISBN:9783642370144
  • Editors:
  • Qiaohong Zu,
  • Bo Hu,
  • Atilla Elçi

Sponsors

  • ETH Zurich
  • CCF: China Computer Federation

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 28 November 2012

Author Tags

  1. ZPBA
  2. Zipf's law
  3. bloom filter
  4. indexing
  5. object locating

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 02 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2018)Selection Optimization of Bloom Filter-Based Index Services in Ubiquitous Embedded SystemsWeb Services – ICWS 201810.1007/978-3-319-94289-6_15(231-245)Online publication date: 25-Jun-2018
  • (2015)Fast approximate hash table using extended counting Bloom filterInternational Journal of Computational Science and Engineering10.1504/IJCSE.2015.07349711:4(380-390)Online publication date: 1-Dec-2015

View Options

View options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media