DOI: 10.1145/1142473.1142522
Article

Design, implementation, and evaluation of the linear road benchmark on the stream processing core

Published: 27 June 2006

Abstract

Stream processing applications have recently gained significant attention in the networking and database community. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically-distributed and rapidly-updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the performance bottlenecks of streaming data applications, in particular the Linear Road stream data management benchmark, in achieving good performance in large-scale distributed environments, using the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes, while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks in the Linear Road application in achieving scalability and low query response latency. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe it also provides useful insights into the overall system behavior for supporting other distributed and large-scale continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.

    Information & Contributors

    Information

    Published In

    SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data
    June 2006
    830 pages
    ISBN: 1595934340
    DOI: 10.1145/1142473
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. bottleneck analysis
    2. distributed stream processing systems
    3. linear road
    4. performance evaluation

    Qualifiers

    • Article

    Conference

    SIGMOD/PODS06
    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 15
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 01 Jan 2025

    Citations

    Cited By

    • (2023) A systematic mapping of performance in distributed stream processing systems. 2023 49th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 293-300. DOI: 10.1109/SEAA60479.2023.00052. Online publication date: 6-Sep-2023
    • (2022) Blue Danube: A Large-Scale, End-to-End Synchronous, Distributed Data Stream Processing Architecture for Time-Sensitive Applications. 2022 IEEE/ACM 26th International Symposium on Distributed Simulation and Real Time Applications (DS-RT), pp. 39-48. DOI: 10.1109/DS-RT55542.2022.9932034. Online publication date: 26-Sep-2022
    • (2022) Stream Benchmarks. Encyclopedia of Big Data Technologies, pp. 1-6. DOI: 10.1007/978-3-319-63962-8_299-2. Online publication date: 24-May-2022
    • (2021) Amoeba. Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing, pp. 1-10. DOI: 10.1145/3468737.3494096. Online publication date: 6-Dec-2021
    • (2021) Fork and Join Queueing Networks with Heavy Tails: Scaling Dimension and Throughput Limit. Journal of the ACM 68:3, pp. 1-30. DOI: 10.1145/3448213. Online publication date: 25-May-2021
    • (2021) Heterogeneity-aware elastic scaling of streaming applications on cloud platforms. The Journal of Supercomputing. DOI: 10.1007/s11227-021-03692-w. Online publication date: 5-Mar-2021
    • (2019) Analyzing efficient stream processing on modern hardware. Proceedings of the VLDB Endowment 12:5, pp. 516-530. DOI: 10.14778/3303753.3303758. Online publication date: 1-Jan-2019
    • (2019) BriskStream. Proceedings of the 2019 International Conference on Management of Data, pp. 705-722. DOI: 10.1145/3299869.3300067. Online publication date: 25-Jun-2019
    • (2019) An Adaptive Online Scheme for Scheduling and Resource Enforcement in Storm. IEEE/ACM Transactions on Networking 27:4, pp. 1373-1386. DOI: 10.1109/TNET.2019.2918341. Online publication date: 1-Aug-2019
    • (2019) Real-Time Stream Data Processing at Scale. 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), pp. 46-51. DOI: 10.1109/PDCAT46702.2019.00020. Online publication date: Dec-2019