
Direct Candidates Generation: A Novel Algorithm for Discovering Complete Share-Frequent Itemsets

  • Conference paper
Fuzzy Systems and Knowledge Discovery (FSKD 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3614)

Abstract

The share value of an itemset is one way of evaluating its magnitude. From a business perspective, itemset share values better reflect the significance of itemsets for mining association rules in a database. The Share-counted FSM (ShFSM) algorithm is one of the best algorithms for efficiently discovering all share-frequent itemsets. However, ShFSM wastes computation time on the join and prune steps of candidate generation in each pass, and it generates too many useless candidates. This study therefore proposes the Direct Candidates Generation (DCG) algorithm, which generates candidates directly, without the join and prune steps, in each pass. Moreover, DCG generates fewer candidates than ShFSM. Experimental results show that the proposed method performs significantly better than ShFSM.
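The abstract does not define the share measure, so the following is only a minimal sketch of the notion as it is commonly defined in the share-mining literature: the share of an itemset is the total quantity of its items in the transactions that contain the whole itemset, divided by the total quantity of all items in the database. The function names, the toy database, and the brute-force enumeration are illustrative assumptions; this is not the DCG or ShFSM algorithm, whose candidate-generation details are not given on this page.

```python
from itertools import combinations


def itemset_share(db, itemset):
    """Share of `itemset` under the assumed definition: quantity of its items
    in transactions containing every item, divided by the total quantity
    of all items in the database."""
    items = set(itemset)
    total = sum(sum(t.values()) for t in db)
    local = sum(
        sum(t[i] for i in items)
        for t in db
        if items.issubset(t)  # transaction must contain every item
    )
    return local / total


def share_frequent_itemsets(db, min_share):
    """Brute-force enumeration of all itemsets whose share >= min_share.
    Exponential in the number of items; for illustration only."""
    all_items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(all_items) + 1):
        for cand in combinations(all_items, k):
            s = itemset_share(db, cand)
            if s >= min_share:
                result[cand] = s
    return result


# Hypothetical toy database: each transaction maps item -> purchased quantity.
db = [
    {"A": 1, "B": 2},
    {"A": 3, "C": 1},
    {"B": 1, "C": 2},
]

frequent = share_frequent_itemsets(db, min_share=0.35)
for itemset, share in frequent.items():
    print(itemset, round(share, 2))
```

In this toy database the itemset {A, C} meets the threshold while its subset {C} does not, which illustrates why share-frequent mining cannot simply reuse the subset-based pruning of ordinary frequent-itemset mining.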

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, YC., Yeh, JS., Chang, CC. (2005). Direct Candidates Generation: A Novel Algorithm for Discovering Complete Share-Frequent Itemsets. In: Wang, L., Jin, Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11540007_67

  • DOI: https://doi.org/10.1007/11540007_67

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28331-7

  • Online ISBN: 978-3-540-31828-6

  • eBook Packages: Computer Science, Computer Science (R0)
