[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Comparing integer data structures for 32- and 64-bit keys

Published: 17 March 2010 Publication History

Abstract

In this article, we experimentally compare a number of data structures operating over keys that are 32- and 64-bit integers. We examine traditional comparison-based search trees as well as data structures that take advantage of the fact that the keys are integers such as van Emde Boas trees and various trie-based data structures. We propose a variant of a burst trie that performs better in time than all the alternative data structures. In addition, even for small sets of keys, this burst trie variant occupies less space than comparison-based data structures such as red-black trees and B-trees. Burst tries have previously been shown to provide a very efficient base for implementing cache efficient string sorting algorithms. We find that with suitable engineering, they also perform excellently as a dynamic ordered data structure operating over integer keys. We provide experimental results when the data structures operate over uniform random data. We also present experimental results for other types of data, including datasets arising from Valgrind, a widely used suite of tools for the dynamic binary instrumentation of programs.

References

[1]
Acharya, A., Zhu, H., and Shen, K. 1999. Adaptive algorithms for cache-efficient tries search. In Proceedings of the 1st International Workshop on Algorithm Engineering and Experimentation (ALENEX'99). Springer-Verlag, Berlin, 296--311.
[2]
Andersson, A. and Nilsson, S. 1993. Improved behavior of tries by adaptive branching. Inf. Process. Lett. 46, 6, 295--300.
[3]
Askitis, N. 2009. Fast and compact hash tables for integer keys. In Proceedings of the 32nd Australasian Computer Science Conference (ACSC'09). ACS, Wellington, New Zealand, 101--110.
[4]
Bayer, R. and McCreight, E. M. 1972. Organization and maintenance of large ordered indices. Acta. Inf. 1, 173--189.
[5]
Bender, M. A., Demaine, E. D., and Farach-Colton, M. 2000. Cache-oblivious b-trees. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS'00). IEEE Computer Society, Los Alamitos, CA, 399.
[6]
Brent, R. P. 2004. Note on marsaglia's xorshift random number generators. J. Stat. Software 11, 5, 1--4.
[7]
Brodal, G. S., Fagerberg, R., and Jacob, R. 2002. Cache oblivious search trees via binary trees of small height. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'02). Society for Industrial and Applied Mathematics, Philadelphia, 39--48.
[8]
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2001. Introduction to Algorithms, 2nd ed. MIT Press, Cambridge, MA, 273--301.
[9]
Crosbie, R. and Gregg, D. 2009. Searching the parameter space for burst tries.
[10]
Demaine, E. 2003. Fixed universe successor problem. Advanced Data Structures, MIT Course 6.897.
[11]
Dementiev, R., Kettner, L., Mehnert, J., and Sanders, P. 2004. Engineering a sorted list data structure for 32 bit keys. In Proceedings of the 6th SIAM Workshop on Algorithm Engineering and Experiments. SIAM, Philadelphia, 142--151.
[12]
Dietzfelbinger, M., Hühne, M., and Weidling, C. 2008. A dictionary implementation based on dynamic perfect hashing. J. Exp. Algorithmics 12, 1--25.
[13]
Dietzfelbinger, M., Karlin, A., Mehlhorn, K., Meyer auf der Heide, F., Rohnert, H., and Tarjan, R. E. 1994. Dynamic perfect hashing: Upper and lower bounds. SIAM J. Comput. 23, 4, 738--761.
[14]
Frias, L., Petit, J., and Roura, S. 2009. Lists revisited: Cache conscious STL lists. J. Exp. Algorithmics 14, 3.5, 9.
[15]
Heinz, S., Zobel, J., and Williams, H. E. 2002. Burst tries: A fast, efficient data structure for string keys. ACM Trans. Inf. Syst. 20, 2, 192--223.
[16]
IEEE. 2008. Standard for floating-point arithmetic. IEEE Std 754-2008, 1--58.
[17]
Intel. 2007. Intel 64 and IA-32 Architectures Optimization Reference Manual.
[18]
Janson, S. and Szpankowski, W. 2007. Partial fill up and search time in lc tries. ACM Trans. Algorithms 3, 4, 44.
[19]
Kasheff, Z. 2004. M.eng., Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
[20]
Knessl, C. and Szpankowski, W. 2000a. Heights in generalized tries and patricia tries. In Proceeding of the 4th Latin American Symposium on Theoretical Informatics (LATIN'00). Springer, Berlin, 298--307.
[21]
Knessl, C. and Szpankowski, W. 2000b. A note on the asymptotic behavior of the heights in b-tries for b large. Electr. J. Comb. 7.
[22]
Knuth, D. E. 1998. The Art of Computer Programming: Sorting and Searching, vol. 3 2nd ed Addison Wesley Longman Publishing Co., Redwood City, CA, 458--478, 482--491, 506.
[23]
Korda, M. and Raman, R. 1999. An experimental evaluation of hybrid data structures for searching. In Proceedings of the 3rd International Workshop on Algorithm Engineering (WAE'99). Springer, Berlin, 213--227.
[24]
Manegold, S. and Boncz, P. 2004. Cache-memory and tlb calibration tool. http://homepages.cwi.nl/~manegold/Calibrator/calibrator.shtml
[25]
Mehlhorn, K. and Näher, S. 1990. Bounded ordered dictionaries in o(loglogn) time and o(n) space. Inf. Process. Lett. 35, 4, 183--189.
[26]
Nash, N. and Gregg, D. 2008. Comparing integer data structures for 32 and 64 bit keys. In Proceedings of the 7th International Workshop on Experimental Algorithms. Springer, Berlin, 28--42.
[27]
Nilsson, S. 1996. Radix sorting and searching. Ph.D. thesis, Lund University.
[28]
Nilsson, S. and Tikkanen, M. 2002. An experimental study of compression methods for dynamic tries. Algorithmica 33, 1, 19--33.
[29]
Rao, J. and Ross, K. A. 2000. Making b+-trees cache conscious in main memory. SIGMOD Rec. 29, 2, 475--486.
[30]
Sinha, R. 2004. Using compact tries for cache-efficient sorting of integers. In Proceedings of the 3rd International Workshop on Efficient and Experimental Algorithms (WEA'04). Springer, Berlin, 513--528.
[31]
Sinha, R., Ring, D., and Zobel, J. 2006. Cache-efficient string sorting using copying. J. Exp. Algorithmics 11, 1.2.
[32]
Sinha, R. and Wirth, A. 2008. Engineering burst-sort: Towards fast in-place string sorting. In Proceedings of the 7th International Workshop on Efficient and Experimental Algorithms (WEA'08). Springer, Berlin, 14--27.
[33]
Sinha, R. and Zobel, J. 2004. Cache-conscious sorting of large sets of strings with dynamic tries. J. Exp. Algorithmics 9, 1.5.
[34]
Sinha, R. and Zobel, J. 2005. Using random sampling to build approximate tries for efficient string sorting. J. Exp. Algorithmics 10, 2, 10.
[35]
Sleator, D. D. and Tarjan, R. E. 1985. Self-adjusting binary search trees. J. ACM 32, 3, 652--686.
[36]
Stroustrup, B. 1997. The C++ Programming Language, 3rd ed. Addison-Wesley, Boston.
[37]
Sussenguth, E. H. 1963. Use of tree structures for processing files. Commun. ACM 6, 5, 272--279.
[38]
van Emde Boas, P. 1977. Preserving order in a forest in less than logarithmic time and linear space. Inf. Process. Lett. 6, 3, 80--82.
[39]
Willard, D. E. 1984. New trie data structures which support very fast search operations. J. Comput. Syst. Sci. 28, 3, 379--394.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Journal of Experimental Algorithmics
ACM Journal of Experimental Algorithmics  Volume 15, Issue
2010
387 pages
ISSN:1084-6654
EISSN:1084-6654
DOI:10.1145/1671970
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 March 2010
Accepted: 01 November 2009
Revised: 01 September 2009
Received: 01 March 2009
Published in JEA Volume 15

Author Tags

  1. Integer keys
  2. level compression
  3. searching
  4. trees
  5. tries

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 22 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media