Requirements for clustering data streams

Published: 01 January 2002

Abstract

Scientific and industrial examples of data streams abound in astronomy, telecommunication operations, banking and stock-market applications, e-commerce, and other fields. A challenge imposed by continuously arriving data streams is to analyze them and to modify the models that explain them as new data arrive. In this paper, we analyze the requirements for clustering data streams. We review some of the latest algorithms in the literature and assess whether they meet these requirements.

Published In

ACM SIGKDD Explorations Newsletter, Volume 3, Issue 2
January 2002
81 pages
ISSN: 1931-0145
EISSN: 1931-0153
DOI: 10.1145/507515

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Data streams
  2. clustering
  3. outliers
  4. tracking changing models

Qualifiers

  • Article
