Abstract
The share of an itemset is one measure of its magnitude. From a business perspective, itemset share values better reflect the significance of itemsets when mining association rules in a database. The Share-counted FSM (ShFSM) algorithm is one of the best algorithms for discovering all share-frequent itemsets efficiently. However, ShFSM spends computation time on the join and prune steps of candidate generation in each pass, and it generates many useless candidates. This study therefore proposes the Direct Candidates Generation (DCG) algorithm, which generates candidates directly in each pass, without the join and prune steps. Moreover, DCG generates fewer candidates than ShFSM. Experimental results show that the proposed method performs significantly better than ShFSM.
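The abstract does not define itemset share, so the following is only a minimal sketch, assuming the definition commonly used in the share-measure literature: the share of an itemset is the total quantity of its items in the transactions that contain the whole itemset, divided by the total quantity of all items in the database. The names `itemset_share`, `is_share_frequent`, `toy_db`, and the dictionary layout of a transaction are hypothetical and chosen for illustration. The sketch does not implement ShFSM or DCG.

```python
from typing import Dict, Iterable, List

# Assumed layout: each transaction maps an item to the quantity purchased.
Transaction = Dict[str, int]


def itemset_share(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Share of `itemset` under the assumed definition described above."""
    items = frozenset(itemset)
    total = sum(sum(t.values()) for t in db)  # quantity of all items in db
    if total == 0 or not items:
        return 0.0
    # Count only the itemset's own items, and only in transactions
    # that contain every item of the itemset.
    local = sum(sum(t[i] for i in items) for t in db if items.issubset(t))
    return local / total


def is_share_frequent(itemset: Iterable[str], db: List[Transaction],
                      min_share: float) -> bool:
    """An itemset is treated as share-frequent if its share reaches min_share."""
    return itemset_share(itemset, db) >= min_share


# Toy data, invented for illustration only.
toy_db: List[Transaction] = [
    {"A": 1, "B": 2},
    {"A": 3, "C": 1},
    {"B": 1, "C": 2},
]
```

On this toy database the total quantity is 10, so `itemset_share({"A"}, toy_db)` is 4/10 and `itemset_share({"A", "B"}, toy_db)` is 3/10. With a hypothetical threshold of 0.35, `{"A"}` would be share-frequent and `{"A", "B"}` would not.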
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, YC., Yeh, JS., Chang, CC. (2005). Direct Candidates Generation: A Novel Algorithm for Discovering Complete Share-Frequent Itemsets. In: Wang, L., Jin, Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11540007_67
DOI: https://doi.org/10.1007/11540007_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28331-7
Online ISBN: 978-3-540-31828-6
eBook Packages: Computer Science (R0)