[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1109/ICDE.2005.53guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

Dynamic Load Distribution in the Borealis Stream Processor

Published: 05 April 2005 Publication History

Abstract

Distributed and parallel computing environments are becoming cheap and commonplace. The availability of large numbers of CPUýs makes it possible to process more data at higher speeds. Stream-processing systems are also becoming more important, as broad classes of applications require results in real-time. Since load can vary in unpredictable ways, exploiting the abundant processor cycles requires effective dynamic load distribution techniques. Although load distribution has been extensively studied for the traditional pull-based systems, it has not yet been fully studied in the context of push-based continuous query processing. In this paper, we present a correlation based load distribution algorithm that aims at avoiding overload and minimizing end-to-end latency by minimizing load variance and maximizing load correlation. While finding the optimal solution for such a problem is NP-hard, our greedy algorithm can find reasonable solutions in polynomial time. We present both a global algorithm for initial load distribution and a pair-wise algorithm for dynamic load migration.

References

[1]
D. Abadi, Y. Ahmad, H. Balakrishnan, M. Balazinska, U. Cetintemel, M. Cherniack, J. Hwang, J. Jannotti, W. Lindner, S. Madden, A. Rasin, M. Stonebraker, N. Tatbul, Y. Xing, S. Zdonik, The Design of the Borealis Stream Processing Engine. In Proc. of the Second Biennial Conference on Innovative Data Systems Research (CIDR), Jan. 2005.
[2]
D. Abadi, D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul and Stan Zdonik. Aurora: A New Model and Architecture for Data Stream Management. VLDB Journal, Sep. 2003.
[3]
A. Adas, Traffic models in broadband networks. IEEE Communications, 35(7):82-89, July 1997.
[4]
M. Balazinska, H. Balakrishnan, and M. Stonebraker. Contract-based load management in federated distributed systems. In USENIX Symposium on Net-worked Systems Design and Implementation (NSDI), March 2004.
[5]
S. Chandrasekaran, A. Deshpande, M. Franklin, J. Hellerstein, W. Hong, S. Krishnamurthy, S. Madden, V. Raman, F. Reiss, and M. Shah. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proc. of the CIDR Conference, Jan. 2003.
[6]
M. Cherniack, H. Balakrishnan, M. Balazinska, D. Carney, U. Cetintemel, Y. Xing and S. Zdonik. Scalable Distributed Stream Processing. In Proc. of the CIDR Conference, 2003.
[7]
R. Diekmann, B. Monien, and R. Preis, Load balancing strategies for distributed memory machines. Multi-Scale Phenomena and Their Simulation, 255-266. World Scientific, 1997.
[8]
A. Foong, T. Huff, H. Hum, J. Patwardhan, G. Regnier, TCP performance re-visited. In Proc. of IEEE Intl Symposium on Performance of Systems and Software, March 2003.
[9]
A. Gallatin, J. Chase, and K. Yocum, Trapeze/IP: TCP/IP at near-gigabit speeds. In Proc. of USENIX Technical Conference, June 1999.
[10]
M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, New York, 1979.
[11]
D. Gupta and P. Bepari, Load sharing in distributed systems, In Proc. of the National Workshop on Distributed Computing, January 1999.
[12]
Mesquite Software, Inc. CSIM 18 Simulation Engine. http://www.mesquite.com/
[13]
R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma. Query processing, approximation, and resource management in a data stream management system. In Proc. of the CIDR Conference, 2003.
[14]
M.A. Shah, J.M. Hellerstein, S. Chandrasekaran, and M.J. Franklin. Flux: An Adaptive Partitioning Operator for Continuous Query Systems. In Proc. of the ICDE Conference, pages 25-36, 2003.
[15]
K. Schloegel, George Karypis and Vipin Kumar. Graph Partitioning for High Performance Scientific Simulations. CRPC Parallel Computing Handbook. Morgan Kaufmann, 2000.
[16]
B. A. Shirazi, A. R. Hurson, and K. M. Kavi. Scheduling and load balancing in parallel and distributed systems. IEEE Computer Science Press, 1995.
[17]
C. Walshaw, M. Cross, and M. G. Everett, Dynamic load balancing for parallel adaptive unstructured meshes. Parallel Processing for Scientific Computing, 1997. 10.
[18]
W. Willinger, M.S. Taqqu, R. Sherman, and D.V. Wilson, Self-similarity through high variability: statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Transactions on Networking, 5(1):71-86, 1997.
[19]
Ying Xing. Load Distribution for Distributed Stream Processing. In Proc. of the ICDE Ph.D. Workshop, 2004.
[20]
C. Xu, B. Monien, R. Luling, and F. Lau. Nearest neighbor algorithms for load balancing in parallel computers. Concurrency: Practice and Experience, 9(12):1351-1376, 1997.

Cited By

View all
  • (2024)Distributed Load Balancing in the Face of Reappearance DependenciesProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659968(321-330)Online publication date: 17-Jun-2024
  • (2021)System-aware dynamic partitioning for batch and streaming workloadsProceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing10.1145/3468737.3494087(1-10)Online publication date: 6-Dec-2021
  • (2021)LachesisProceedings of the 22nd International Middleware Conference10.1145/3464298.3493407(365-378)Online publication date: 6-Dec-2021
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ICDE '05: Proceedings of the 21st International Conference on Data Engineering
April 2005
8301 pages
ISBN:0769522858

Publisher

IEEE Computer Society

United States

Publication History

Published: 05 April 2005

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 21 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Distributed Load Balancing in the Face of Reappearance DependenciesProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659968(321-330)Online publication date: 17-Jun-2024
  • (2021)System-aware dynamic partitioning for batch and streaming workloadsProceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing10.1145/3468737.3494087(1-10)Online publication date: 6-Dec-2021
  • (2021)LachesisProceedings of the 22nd International Middleware Conference10.1145/3464298.3493407(365-378)Online publication date: 6-Dec-2021
  • (2019)Language-integrated privacy-aware distributed queriesProceedings of the ACM on Programming Languages10.1145/33605933:OOPSLA(1-30)Online publication date: 10-Oct-2019
  • (2018)ReferencesMaking Databases Work10.1145/3226595.3226642(635-644)Online publication date: 1-Dec-2018
  • (2018)The collected works of Michael StonebrakerMaking Databases Work10.1145/3226595.3226641(606-633)Online publication date: 1-Dec-2018
  • (2018)The design and implementation of INGRESMaking Databases Work10.1145/3226595.3226640(561-605)Online publication date: 1-Dec-2018
  • (2018)The implementation of POSTGRESMaking Databases Work10.1145/3226595.3226639(519-559)Online publication date: 1-Dec-2018
  • (2018)C-storeMaking Databases Work10.1145/3226595.3226638(491-518)Online publication date: 1-Dec-2018
  • (2018)The end of an architectural eraMaking Databases Work10.1145/3226595.3226637(463-489)Online publication date: 1-Dec-2018
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media