Abstract
Because streaming data are unbounded in length and change rapidly over time, limiting memory usage is essential when mining data streams. The maximal frequent itemsets are a subset of the frequent itemsets that compactly represents their important information at low computational cost. In this paper, we propose an algorithm, MMFI_DSSW (Mining Maximal Frequent Itemsets in Data Streams Sliding Window), that mines maximal frequent itemsets in a sliding window using a novel summary data structure, MFI_BVT (Maximal Frequent Itemsets Binary Vector Table). MFI_BVT first builds a binary vector for each itemset. MMFI_DSSW then applies logical AND operations to mine all the maximal frequent itemsets in MFI_BVT with a single-pass scan of the incoming data. Finally, the mining result can be updated incrementally. Experiments show that MMFI_DSSW is efficient and scalable in both memory usage and CPU running time.
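The abstract does not spell out how MFI_BVT is laid out, so the following is only a minimal sketch of the general idea it names: one bit vector per item over a transaction-sensitive sliding window, with itemset support obtained by a logical AND. The class `SlidingWindowBitVectors`, the function `maximal_frequent_itemsets`, and the brute-force level-wise search are hypothetical illustrations and should not be read as the authors' MMFI_DSSW procedure.

```python
from itertools import combinations


class SlidingWindowBitVectors:
    """Hypothetical per-item bit vectors over the last `window_size` transactions."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.mask = (1 << window_size) - 1  # keeps only the newest window_size bits
        self.vectors = {}  # item -> int used as a bit vector

    def add(self, transaction):
        # Shift every vector left by one; masking drops the oldest transaction.
        for item in list(self.vectors):
            shifted = (self.vectors[item] << 1) & self.mask
            if shifted:
                self.vectors[item] = shifted
            else:
                del self.vectors[item]  # item no longer occurs in the window
        # Set the lowest bit for every item in the incoming transaction.
        for item in transaction:
            self.vectors[item] = self.vectors.get(item, 0) | 1

    def support(self, itemset):
        # AND the item vectors; each remaining 1-bit is a transaction
        # in the window that contains the whole (non-empty) itemset.
        bits = self.mask
        for item in itemset:
            bits &= self.vectors.get(item, 0)
        return bin(bits).count("1")


def maximal_frequent_itemsets(window, min_support):
    """Brute-force level-wise search (assumed here, not taken from the paper)."""
    level = [frozenset([i]) for i in window.vectors
             if window.support([i]) >= min_support]
    frequent = []
    while level:
        frequent.extend(level)
        # Join pairs that differ in exactly one item to form the next level.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if window.support(c) >= min_support]
    # An itemset is maximal if no frequent itemset strictly contains it.
    return [s for s in frequent if not any(s < t for t in frequent)]


window = SlidingWindowBitVectors(window_size=4)
for transaction in [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]:
    window.add(transaction)
print(sorted(sorted(s) for s in maximal_frequent_itemsets(window, min_support=2)))
```

In this sketch, sliding the window costs one shift per tracked item, and the support of any itemset is a popcount of an AND, which is the kind of cheap bitwise bookkeeping the abstract alludes to.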
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feng, J., Ren, J. (2007). MMFI_DSSW – A New Method to Incrementally Mine Maximal Frequent Itemsets in Transaction Sensitive Sliding Window. In: Zhang, Z., Siekmann, J. (eds) Knowledge Science, Engineering and Management. KSEM 2007. Lecture Notes in Computer Science, vol 4798. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76719-0_47
DOI: https://doi.org/10.1007/978-3-540-76719-0_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76718-3
Online ISBN: 978-3-540-76719-0
eBook Packages: Computer Science, Computer Science (R0)