DOI: 10.1145/3491003.3491008
research-article
Open access

A Study on Migration Scheduling in Distributed Stream Processing Engines

Published: 24 January 2022

Abstract

The cost of migrating stateful operators in distributed stream processing has attracted research attention. Reactive migration in response to context changes is the common approach. Other migration scheduling strategies, such as proactive migration based on prediction and delayed migration, have been largely neglected.
This paper investigates four algorithms that explore these alternative scheduling strategies. The algorithms are implemented in a prototype stream processing overlay and run over an emulated network. Experiments with a synthetic workload reveal that (1) proactive migration can reduce average event delivery latency, and (2) it is important to handle noise in the data to avoid wrong adaptations. Experiments with a real workload demonstrate that (1) proactivity is not always beneficial, and (2) careful timing of migration, depending on operator state, has large potential to limit overhead. The experiments demonstrate a reduction in state size of 38%, resulting in a 30% reduction in freeze time. Consideration of operator state size is especially important: when network resources are limited, the state transfer can lead to contention that further harms event delivery and/or causes network timeouts.
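
The idea of timing a migration by operator state can be illustrated with a minimal sketch. It is not one of the four algorithms studied in the paper, whose details the abstract does not give. The freeze-time model (a fixed overhead plus state size divided by bandwidth), the function names, the state-size trace, and all numeric values are assumptions made purely for illustration.

```python
from typing import Sequence


def freeze_time_s(state_bytes: float,
                  bandwidth_bytes_per_s: float,
                  fixed_overhead_s: float = 0.05) -> float:
    """Assumed toy model: the operator is frozen for a fixed overhead
    plus the time needed to copy its state over the network."""
    return fixed_overhead_s + state_bytes / bandwidth_bytes_per_s


def pick_migration_step(state_sizes: Sequence[float], deadline: int) -> int:
    """Delayed migration (hypothetical policy): among steps 0..deadline,
    return the step at which the operator state is smallest."""
    window = list(state_sizes[: deadline + 1])
    if not window:
        raise ValueError("state_sizes must contain at least one sample")
    return min(range(len(window)), key=window.__getitem__)


# Hypothetical state-size trace in bytes, e.g. a windowed operator whose
# state shrinks when a window closes. Step 0 is when the context change
# is detected, i.e. when a purely reactive scheduler would migrate.
trace = [1000, 1400, 1800, 620, 900, 1300]
bandwidth = 1000.0  # bytes per second, assumed

step = pick_migration_step(trace, deadline=4)
reactive = freeze_time_s(trace[0], bandwidth)
delayed = freeze_time_s(trace[step], bandwidth)

print(f"migrate at step {step}")
print(f"freeze time: reactive {reactive:.2f} s, delayed {delayed:.2f} s")
```

Under this model, a fixed overhead means that a given relative reduction in state size yields a smaller relative reduction in freeze time, which is at least consistent with the 38% and 30% figures reported above. The actual relationship in the paper's experiments may differ.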

Cited By

  • (2023) To Migrate or Not to Migrate: An Analysis of Operator Migration in Distributed Stream Processing. IEEE Communications Surveys & Tutorials 26:1 (670-705). DOI: 10.1109/COMST.2023.3330953. Online publication date: 7-Nov-2023

Published In

ICDCN '22: Proceedings of the 23rd International Conference on Distributed Computing and Networking
January 2022
298 pages
ISBN:9781450395601
DOI:10.1145/3491003
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Distributed Systems
  2. Migration
  3. Prediction
  4. Stream Processing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICDCN '22

Article Metrics

  • Downloads (Last 12 months): 120
  • Downloads (Last 6 weeks): 21
Reflects downloads up to 11 Jan 2025
