[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

Characterizing memory requirements for queries over continuous data streams

Published: 01 March 2004 Publication History

Abstract

This article deals with continuous conjunctive queries with arithmetic comparisons and optional aggregation over multiple data streams. An algorithm is presented for determining whether or not any given query can be evaluated using a bounded amount of memory for all possible instances of the data streams. For queries that can be evaluated using bounded memory, an execution strategy based on constant-sized synopses of the data streams is proposed. For queries that cannot be evaluated using bounded memory, data stream scenarios are identified in which evaluating the queries requires memory linear in the size of the unbounded streams.

Supplementary Material

PDF File (p162-arasu-app.pdf)
Article appendix

References

[1]
Alon, N., Matias, Y., and Szegedy, M. 1999. The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58, 1 (Feb.), 137--147.
[2]
Arasu, A., Babcock, B., Babu, S., McAlister, J., and Widom, J. 2002a. Characterizing memory requirements for queries over continuous data streams. In Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM, New York, 221--232.
[3]
Arasu, A., Babu, S., and Widom, J. 2002b. An abstract semantics and concrete language for continuous queries over streams and relations. Tech. Rep. http://dbpubs.stanford.edu/pub/2002-57, Stanford University. Nov.
[4]
Babcock, B., Babu, S., Datar, M., Motwani, R., and Widom, J. 2002. Models and issues in data stream systems. In Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM, New York, 1--16.
[5]
Babcock, B., Datar, M., and Motwani, R. 2002. Sampling from a moving window over streaming data. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms. ACM, New York, 633--634.
[6]
Babu, S. and Widom, J. 2002. Exploiting k-constraints to reduce memory overhead in continuous queries over data streams. Tech. Rep. http://dbpubs.stanford.edu/pub/2002-52, Stanford University. Nov.
[7]
Carney, D., Centintemel, U., Cherniack, M., Convey, C., Lee, S., Seidman, G., Stonebraker, M., Tatbul, N., and Zdonik, S. B. 2002. Monitoring streams---A new class of data management applications. In Proceedings of the 28th International Conference on Very Large Data Bases. Morgan-Kaufmann, San Mateo, Calif., 215--226.
[8]
Chandrasekharan, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Resis, F., and Shah, M. A. 2003. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proceedings of the 1st Conference on Innovative Data Systems Research. 269--280.
[9]
Chandrasekharan, S. and Franklin, M. J. 2002. Streaming queries over streaming data. In Proceedings of the 28th International Conference on Very Large Data Bases. Morgan-Kaufmann, San Mateo, Calif., 203--214.
[10]
Chen, J., DeWitt, D. J., Tian, F., and Wang, Y. 2000. Niagaracq: A scalable continuous query system for internet databases. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. ACM, New York, 379--390.
[11]
Chomicki, J. 1995. Efficient checking of temporal integrity constraints using bounded history encoding. ACM Trans. Datab. Syst. 20, 2, 149--186.
[12]
Cranor, C., Johnson, T., Spataschek, O., and Shkapenyuk, V. 2003. Gigascope: A stream database for network applications. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data. ACM, New York, 647--651.
[13]
Datar, M., Gionis, A., Indyk, P., and Motwani, R. 2002. Maintaining stream statistics over sliding windows. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms. ACM, New York, 635--644.
[14]
Dobra, A., Garofalakis, M. N., Gehrke, J., and Rastogi, R. 2002. Processing complex aggregate queries over data streams. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data. ACM, New York, 61--72.
[15]
Feigenbaum, J., Kannan, S., Strauss, M., and Viswanathan, M. 1999. An approximate l1-difference algorithm for massive data streams. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, Calif., 501--511.
[16]
Gehrke, J. 2003. Special issue on data stream processing. IEEE Comput. Soc. Bull. Tech. Comm. Data Eng. 26, 1 (Mar.).
[17]
Gehrke, J., Korn, F., and Srivastava, D. 2001. On computing correlated aggregates over continual data streams. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data. ACM, New York, 13--24.
[18]
Gilbert, A. C., Guha, S., Indyk, P., Muthukrishnan, S., and Strauss, M. 2002. Fast, small-space algorithms for approximate histogram maintenance. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing. ACM, New York, 389--398.
[19]
Golab, L. and Ozsu, M. T. 2003. Issues in data stream management. SIGMOD Record 32, 2 (June), 5--14.
[20]
Gray, J., Chaudhuri, S., Bosworth, A., Layman, A., Reichart, A., and Venkatrao, M. 1997. Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. Data Min. Knowl. Disc. 1, 1 (Mar.), 29--53.
[21]
Greenwald, M. and Khanna, S. 2001. Space-efficient online computation of quantile summaries. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data. ACM, New York, 58--66.
[22]
Guha, S., Koudas, N., and Shim, K. 2001. Data-streams and histograms. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing. ACM, New York, 471--475.
[23]
Madden, S., Shah, M. A., Hellerstein, J. M., and Raman, V. 2002. Continuously adaptive continuous queries over streams. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data. ACM, New York, 49--60.
[24]
Manku, G. S., Rajagopalan, S., and Lindsay, B. G. 1999. Random sampling techniques for space efficient online computation of order statistics of large datasets. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data. ACM, New York, 251--262.
[25]
Shah, M., Hellerstein, J. M., Chandrasekharan, S., and Franklin, M. J. 2003. Flux: An adaptive partitioning operator for continuous query systems. In Proceedings of the 19th International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, Calif.
[26]
STREAM. 2003. Stanford stream data management project. http://www-db.stanford.edu/stream.
[27]
Terry, D. B., Goldberg, D., Nichols, D., and Oki, B. M. 1992. Continuous queries over append-only databases. In Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data. ACM, New York, 321--330.
[28]
Traderbot. 2003. Traderbot home page. http://www.traderbot.com.
[29]
Tucker, P. A., Maier, D., Sheard, T., and Fegaras, L. 2003. Exploiting punctuation semantics in continuous data streams. IEEE Trans. Knowl. Data Eng. 15, 3, 555--568.
[30]
Ullman, J. D. 1988. Principles of Database and Knowledge-Base Systems. Vol. I. Computer Science Press, Rockville, Md., Chapter 3, 121--122.
[31]
Ullman, J. D. 1989. Principles of Database and Knowledge-Base Systems. Vol. II. Computer Science Press, Rockville, Md., Chapter 14, 892--893.
[32]
Vitter, J. 1985. Random sampling with a reservoir. ACM Trans. Math. Softw. 11, 1 (Mar.), 37--57.

Cited By

View all
  • (2023)Stream reasoning with DatalogMTLJournal of Web Semantics10.1016/j.websem.2023.10077676(100776)Online publication date: Apr-2023
  • (2022)sGrapp: Butterfly Approximation in Streaming GraphsACM Transactions on Knowledge Discovery from Data10.1145/349501116:4(1-43)Online publication date: 8-Jan-2022
  • (2022)When Less Is More: Systematic Analysis of Cascade-Based Community DetectionACM Transactions on Knowledge Discovery from Data10.1145/349456316:4(1-22)Online publication date: 8-Jan-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 29, Issue 1
March 2004
232 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/974750
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 March 2004
Published in TODS Volume 29, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Continuous queries
  2. memory requirement
  3. streams

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)2
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Stream reasoning with DatalogMTLJournal of Web Semantics10.1016/j.websem.2023.10077676(100776)Online publication date: Apr-2023
  • (2022)sGrapp: Butterfly Approximation in Streaming GraphsACM Transactions on Knowledge Discovery from Data10.1145/349501116:4(1-43)Online publication date: 8-Jan-2022
  • (2022)When Less Is More: Systematic Analysis of Cascade-Based Community DetectionACM Transactions on Knowledge Discovery from Data10.1145/349456316:4(1-22)Online publication date: 8-Jan-2022
  • (2022)Distributed Triangle Approximately Counting Algorithms in Simple Graph StreamACM Transactions on Knowledge Discovery from Data10.1145/349456216:4(1-43)Online publication date: 8-Jan-2022
  • (2022)A Trajectory Evaluator by Sub-tracks for Detecting VOT-based Anomalous TrajectoryACM Transactions on Knowledge Discovery from Data10.1145/349003216:4(1-19)Online publication date: 8-Jan-2022
  • (2022)Proactive and intelligent evaluation of big data queries in edge clouds with materialized viewsComputer Networks: The International Journal of Computer and Telecommunications Networking10.1016/j.comnet.2021.108664203:COnline publication date: 11-Feb-2022
  • (2021)Operationalizing Analytics with NewSQLSoftware Engineering and Algorithms10.1007/978-3-030-77442-4_21(249-263)Online publication date: 20-Jul-2021
  • (2019)On Bounded-Memory Stream Data Processing with Description LogicsDescription Logic, Theory Combination, and All That10.1007/978-3-030-22102-7_30(639-660)Online publication date: 1-Jun-2019
  • (2018)Bounded-Memory Stream ProcessingKI 2018: Advances in Artificial Intelligence10.1007/978-3-030-00111-7_32(377-390)Online publication date: 30-Aug-2018
  • (2018)Data Stream Management Architectures and PrototypesEncyclopedia of Database Systems10.1007/978-1-4614-8265-9_24(854-860)Online publication date: 7-Dec-2018
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media