
FINGERPRINT: Summarizing Cluster Evolution in Dynamic Environments

Published: 01 July 2012

Abstract

Monitoring and interpreting changing patterns is of paramount importance for data mining applications in dynamic environments. While much research addresses adapting patterns in the presence of drift or shift, less addresses how to maintain an overview of pattern changes over time. A major challenge is summarizing changes effectively, so that the user can understand the nature of the change while the demand on resources remains low. To this end, the authors propose FINGERPRINT, an environment for summarizing cluster evolution. Cluster changes are captured in an "Evolution Graph," which is then summarized into a fingerprint of evolution by merging similar clusters. The authors propose a batch summarization method that traverses and summarizes the Evolution Graph as a whole, and an incremental method that is applied while cluster transitions are being discovered. They present experiments on different data streams and discuss the space reduction and information preservation achieved by the two methods.
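The idea of merging similar clusters along a line of evolution can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each cluster is represented only by its centroid, that one path through the graph (here called a trace) is already given as a list ordered by time, and that similarity is Euclidean distance against a hypothetical threshold `tau`. The function name `summarize_trace` is likewise an illustration.

```python
from math import dist


def _mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(xs) / len(points) for xs in zip(*points))


def summarize_trace(trace, tau):
    """Greedily merge consecutive, similar clusters of one trace.

    trace: list of centroids (tuples of floats), one per timepoint,
           oldest first. Centroid-only clusters are an assumption.
    tau:   hypothetical distance threshold. A cluster joins the current
           group if it lies within tau of the group's mean centroid.
    Returns a list of (mean_centroid, first_index, last_index) summaries.
    """
    summaries = []
    group = []   # centroids collected for the summary being built
    start = 0    # trace index at which the current group began
    for i, centroid in enumerate(trace):
        if group and dist(_mean(group), centroid) > tau:
            # The cluster drifted too far: close the group, start a new one.
            summaries.append((_mean(group), start, i - 1))
            group, start = [], i
        group.append(centroid)
    if group:
        summaries.append((_mean(group), start, len(trace) - 1))
    return summaries


if __name__ == "__main__":
    trace = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.1), (5.0, 5.0), (5.1, 5.0)]
    for mean, first, last in summarize_trace(trace, tau=1.0):
        print(f"timepoints {first}..{last}: mean centroid {mean}")
```

In this toy trace, five clusters collapse into two summaries, which is the kind of space reduction the abstract refers to. The threshold controls the trade-off: a larger `tau` stores fewer summaries but preserves less information about intermediate changes.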


Cited By

  • (2024) Predicting Co-movement patterns in mobility data. Geoinformatica 28:2 (221-243). DOI: 10.1007/s10707-022-00478-x. Online publication date: 1-Apr-2024
  • (2021) MaSEC: Discovering Anchorages and Co-movement Patterns on Streaming Vessel Trajectories. Proceedings of the 17th International Symposium on Spatial and Temporal Databases (170-173). DOI: 10.1145/3469830.3470909. Online publication date: 23-Aug-2021
  • (2015) Discovering and monitoring product features and the opinions on them with OPINSTREAM. Neurocomputing 150:PA (318-330). DOI: 10.1016/j.neucom.2014.04.079. Online publication date: 20-Feb-2015

    Published In

    International Journal of Data Warehousing and Mining  Volume 8, Issue 3
    July 2012
    61 pages
    ISSN:1548-3924
    EISSN:1548-3932

    Publisher

    IGI Global

    United States


    Author Tags

    1. Change Detection
    2. Change Monitoring
    3. Change Summarization
    4. Cluster Evolution
    5. Cluster Summarization
    6. Data Streams
    7. Dynamic Environments

    Qualifiers

    • Article
