Abstract
The hybrid-hash algorithm and its parallel variant have been recently found to outperform all other algorithms in joining disk-based large relations on uniprocessor and ring-interconnected distributed-memory multiprocessor database machines. This paper presents several extensions to the centralized and distributed hybrid-hash algorithms. These extensions are based on the usage of one or more bit-vectors as filters. A comparative performance study to the presented algorithms is carried out. In a uniprocessor environment, this study shows that one of the proposed filter-based algorithms outperforms all of the other ones, including the hybrid-hash algorithm. In a distributed environment, the filter-based algorithms are found to suffer from a serious problem, namely, overloading the interconnection network with the transmission of large size bit-vectors. Different compression schemes are proposed to reduce the size of a transmitted bit-vector. The augmentation of the distributed version of best-performing centralized algorithm with one of the proposed compression schemes have been found to outperform all of the other algorithms and substantially improves the performance of the join operation.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Babb, E., "Implementing a Relational Database by Means of Specialized Hardware." ACM Trans. on Database Systems, Vol. 4, No. 6 (June 1979), pp. 414–429.
Baru, C. and Frieder, F., "Implementing Relational Database Operations in a Cube-Connected Multicomputer System." Proceedings of the Third International Conference on Data Engineering, 1987.
Baru, C. K. and Su, S. Y. W., "The Architecture of SM3: A Dynamically Partitionable Multicomputer System." IEEE Transaction on Computers, Vol. C-35, No. 9(September 1986), pp. 790–802.
Bitton, D., et al., "Parallel Algorithms for the Execution of Relational Database Operations." ACM Trans. on Database Systems, Vol.8, No. 3(September 1983), pp. 324–353.
Dewitt, D. J., "DIRECT-A Multiprocessor Organization for Supporting Relational Database Management Systems." IEEE Transaction on Computers, Vol. C-28, No. 6(June 1979), pp. 395–408.
Dewitt, D. J., et al., "A Single User Evaluation for the Gamma Database Machine." Proceedings of the 5th International Workshop on Database Machines, 1987.
Dewitt, D., J., et al., "GAMMA — A High Performance Dataflow Database Machine." Proceedings of the 12th International Conference on Very Large Databases, 1986, pp. 228–237.
Dewitt, D., J. and Gerber, R., "Multiprocessor Hash-Based Join Algorithms." Proceedings of VLDB, 1985, pp. 151–164.
Dewitt, D. J. et al., "Implementation Techniques for Large Main Memory Database Systems." Proceedings of SIGMOD, 1984.
Gardarin, G., "An Introduction to SABRE: A Multi-Microprocessor Database Machine." Proceedings of the 6th Workshop on Computer Architecture for Non-numeric Processing, Hyeres, France, June 1981.
Gerber, R., "Dataflow Query Processing Using Multiprocessor Hash-Partitioned Algorithms." Tech. Report #672, Computer Sciences Department, University of Wisconsin-Madison, Oct. 1986.
Goodman, J. R., "An Investigation of Multiprocessor Structures and Algorithms for Database Management." Memo No. UCB/ERLM81 (May 81), Electronic Research Lab., College of Engineering, University of California, Berkeley.
Hsiao, D. K., and Menon, M. J., "Design and Analysis of a Multi-Backend Database System for Performance Improvements, Functionality Expansion and Capacity Growth (Part I and II)." Technical Reports, OSU-CISRC-TR-81-7 and OSU-CISRC-TR-81-8, The Ohio State University, Columbus, Ohio, 1981.
Kim, W., "Relational Database Systems." ACM Computing Survey, Vol. 11, No.3, 1979, pp. 185–211.
Kitsuregawa, et al., "Architecture and Performance of Relational Algebra Machine GRACE." Proceedings of the International Conference on Parallel Processing, 1984, pp. 241–250.
Qadah, G. Z. and Irani, K. B., "The Join Operation on A Shared-memory Multiprocessor Database Machine." to appear in the IEEE Transaction on Software Engineering.
Qadah, G. Z., "Filter-based Algorithms on Uniprocessor and Distributed-memory Multiprocessor Database Machines." Technical Report # 87-06-DBM-03, EECS Department, Northwestern University, 1987.
Qadah, G. Z., "Database Machines: A Survey." Proceedings of the National Computer Conference, AFIPS Press, 1985, pp.211–223.
Qadah, G. Z. and Irani, K. B., "A Database Machine for Very Large Relational Databases." IEEE Transaction on Computers, Vol. C-34, No. 11(November 1985), pp. 1015–1025.
Schweppe, H., Zeidler, H., Hell, W., Leilich, H., Stiege, G. and Teich, W., "RDBM-A Dedicated Multiprocessor System for Database Management." Advanced Database Architecture, Hsiao, D. K.(ed.), Prentice-Hall, 1983, pp. 36–86.
Seitz, C., "The Cosmic Cube." Communication of ACM, Vol. 28, No. 1 (Jan. 1985), pp. 22–33.
Shultz, R. and Miller, lla, "Tree Structured Multiple Processor Join Methods." Proceedings of the 3rd Data Engineering Conference, 1987, pp. 190–199.
Teradata: DBC/1012 Data Base Computer Concepts and Facilities, Teradata Corp. Document No. C02-0001-01, 1984.
Valduriez, P. and Gardarin, G., "Join and Semijoin Algorithms for a Multiprocessor Database Machine." ACM Trans. on Database Systems, Vol. 9, No. 1(March 1984), pp. 133–161.
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 1988 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qadah, G.Z. (1988). Filter-based join algorithms on uniprocessor and distributed-memory multiprocessor database machines. In: Schmidt, J.W., Ceri, S., Missikoff, M. (eds) Advances in Database Technology—EDBT '88. EDBT 1988. Lecture Notes in Computer Science, vol 303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-19074-0_64
Download citation
DOI: https://doi.org/10.1007/3-540-19074-0_64
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-19074-5
Online ISBN: 978-3-540-39095-4
eBook Packages: Springer Book Archive