Abstract
We propose SuffixMiner, a new algorithm for finding all frequent itemsets in data streams. It eliminates the need for multiple passes over the data, exploits the properties of the suffix tree to avoid both generating candidate itemsets and traversing every suffix tree during itemset growth, and uses a new itemset growth method to mine all frequent itemsets. Experimental results show that SuffixMiner scales well when mining frequent itemsets over data streams and outperforms the Apriori and FP-Growth algorithms.
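The abstract does not give the details of the suffix-forest, so the following is only an illustrative sketch of the general idea, not the SuffixMiner algorithm itself. It assumes each transaction is sorted into a fixed item order and that every suffix of it is inserted, in a single pass, into a forest with one counted trie per starting item. The names `SuffixForest`, `Node`, and `frequent_itemsets` are hypothetical, and the depth-first extension used for mining is a plain stand-in for the paper's own growth method.

```python
class Node:
    """One trie node: how many inserted suffixes pass through it."""
    __slots__ = ("count", "children")

    def __init__(self):
        self.count = 0
        self.children = {}


class SuffixForest:
    """Forest of counted tries, one tree per starting item (assumed layout)."""

    def __init__(self):
        self.roots = {}  # first item of a suffix -> root Node of its tree

    def add(self, transaction):
        """Single-pass update: insert every suffix of the sorted transaction."""
        items = sorted(set(transaction))
        for start in range(len(items)):
            node = self.roots.setdefault(items[start], Node())
            node.count += 1
            for item in items[start + 1:]:
                node = node.children.setdefault(item, Node())
                node.count += 1

    def support(self, itemset):
        """Number of inserted transactions that contain every item of itemset."""
        items = sorted(set(itemset))
        if not items or items[0] not in self.roots:
            return 0
        return self._match(self.roots[items[0]], items, 1)

    def _match(self, node, items, pos):
        # All wanted items matched: every suffix through this node contains them.
        if pos == len(items):
            return node.count
        total = 0
        for item, child in node.children.items():
            if item == items[pos]:
                total += self._match(child, items, pos + 1)
            elif item < items[pos]:
                total += self._match(child, items, pos)
            # item > items[pos]: paths are sorted, so items[pos] cannot follow.
        return total


def frequent_itemsets(forest, min_support):
    """Depth-first growth with support pruning (a stand-in, not the paper's method)."""
    result = {}

    def grow(prefix, candidates):
        for i, item in enumerate(candidates):
            itemset = prefix + (item,)
            s = forest.support(itemset)
            if s >= min_support:
                result[itemset] = s
                grow(itemset, candidates[i + 1:])

    grow((), sorted(forest.roots))
    return result


# Usage: feed the stream once, then query the forest.
forest = SuffixForest()
for transaction in [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]:
    forest.add(transaction)
print(frequent_itemsets(forest, min_support=3))
```

Storing every suffix costs space quadratic in the transaction length, so this layout is meant to show the single-pass property rather than to be memory-efficient.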
This work was supported by the Natural Science Foundation of China (Grant No. 60433020) and the Key Science-Technology Project of the National Education Ministry of China (Grant No. 02090).
References
Manku, G.S., Motwani, R.: Approximate Frequency Counts Over Data Streams. In: Proceedings of the International Conference on Very Large Data Bases, Hong Kong, China, pp. 346–357 (2002)
Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In: Proceedings of the International Conference on Very Large Data Bases, Santiago de Chile, Chile, pp. 487–499 (1994)
Giannella, C., Han, J., Pei, J., Yan, X., Yu, P.S.: Mining Frequent Patterns in Data Streams at Multiple Time Granularities. In: Next Generation Data Mining, Ch. 3, pp. 191–211 (2002)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jia, L., Zhou, C., Wang, Z., Xu, X. (2005). SuffixMiner: Efficiently Mining Frequent Itemsets in Data Streams by Suffix-Forest. In: Wang, L., Jin, Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11540007_72
DOI: https://doi.org/10.1007/11540007_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28331-7
Online ISBN: 978-3-540-31828-6
eBook Packages: Computer Science, Computer Science (R0)