[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

Constrained frequent pattern mining: a pattern-growth view

Published: 01 June 2002 Publication History

Abstract

It has been well recognized that frequent pattern mining plays an essential role in many important data mining tasks. However, frequent pattern mining often generates a very large number of patterns and rules, which reduces not only the efficiency but also the effectiveness of mining. Recent work has highlighted the importance of the constraint-based mining paradigm in the context of mining frequent itemsets, associations, correlations, sequential patterns, and many other interesting patterns in large databases.Recently, we developed efficient pattern-growth methods for frequent pattern mining. Interestingly, pattern-growth methods are not only efficient but also effective in mining with various constraints. Many tough constraints which cannot be handled by previous methods can be pushed deep into the pattern-growth mining process. In this paper, we overview the principles of pattern-growth methods for constrained frequent pattern mining and sequential pattern mining. Moreover, we explore the power of pattern-growth methods towards mining with tough constraints and highlight some interesting open problems.

References

[1]
R. Agarwal, C. Aggarwal, and V. V. V. Prasad. Depth-first generation of large itemsets for association rules. In IBM Technical Report RC21538, July 1999.
[2]
R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 1994 Int. Conf. Very Large Data Bases (VLDB'94), pages 487-499, Santiago, Chile, Sept. 1994.
[3]
R. Agrawal and R. Srikant. Mining sequential patterns. In Proc. 1995 Int. Conf. Data Engineering (ICDE'95), pages 3-14, Taipei, Taiwan, Mar. 1995.
[4]
R. J. Bayardo. Efficiently mining long patterns from databases. In Proc. 1998 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'98), pages 85-93, Seattle, WA, June 1998.
[5]
K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cubes. In Proc. 1999 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'99), pages 359-370, Philadelphia, PA, June 1999.
[6]
S. Brin, R. Motwani, and C. Silverstein. Beyond market basket: Generalizing association rules to correlations. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'97), pages 265-276, Tucson, Arizona, May 1997.
[7]
D. Burdick, M. Calimlim, and J. Gehrke. MAFIA: A maximal frequent itemset algorithm for transactional databases. In Proc. 2001 Int. Conf. Data Engineering (ICDE'01), pages 443-452, Heidelberg, Germany, April 2001.
[8]
G. Dong and J. Li. Efficient mining of emerging patterns: Discovering trends and differences. In Proc. 1999 Int. Conf. Knowledge Discovery and Data Mining (KDD'99), pages 43-52, San Diego, CA, Aug. 1999.
[9]
G. Grahne, L. Lakshmanan, and X. Wang. Efficient mining of constrained correlated sets. In Proc. 2000 Int. Conf. Data Engineering (ICDE'00), pages 512-521, San Diego, CA, Feb. 2000.
[10]
S. Guha, R. Rastogi, and K. Shim. Rock: A robust clustering algorithm for categorical attributes. In Proc. 1999 Int. Conf. Data Engineering (ICDE'99), pages 512-521, Sydney, Australia, Mar. 1999.
[11]
J. Han, G. Dong, and Y. Yin. Efficient mining of partial periodic patterns in time series database. In Proc. 1999 Int. Conf. Data Engineering (ICDE'99), pages 106-115, Sydney, Australia, April 1999.
[12]
J. Han and J. Pei. Mining frequent patterns by pattern-growth: Methodology and implications. SIGKDD Explorations (Special Issue on Scalable Data Mining Algorithms), 2:14-20, 2000.
[13]
J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M.-C. Hsu. FreeSpan: Frequent pattern-projected sequential pattern mining. In Proc. 2000 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD'00), pages 355-359, Boston, MA, Aug. 2000.
[14]
J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), pages 1-12, Dallas, TX, May 2000.
[15]
M. Kamber, J. Han, and J. Y. Chiang. Metarule-guided mining of multi-dimensional association rules using data cubes. In Proc. 1997 Int. Conf. Knowledge Discovery and Data Mining (KDD'97), pages 207-210, Newport Beach, CA, Aug. 1997.
[16]
M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A. Verkamo. Finding interesting rules from large sets of discovered association rules. In Proc. 3rd Int. Conf. Information and Knowledge Management, pages 401-408, Gaithersburg, Maryland, Nov. 1994.
[17]
L. V. S. Lakshmanan, R. Ng, J. Han, and A. Pang. Optimization of constrained frequent set queries with 2-variable constraints. In Proc. 1999 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'99), pages 157-168, Philadelphia, PA, June 1999.
[18]
B. Lent, A. Swami, and J. Widom. Clustering association rules. In Proc. 1997 Int. Conf. Data Engineering (ICDE'97), pages 220-231, Birmingham, England, April 1997.
[19]
B. Liu, W. Hsu, and Y. Ma. Integrating classification and association rule mining. In Proc. 1998 Int. Conf. Knowledge Discovery and Data Mining (KDD'98), pages 80-86, New York, NY, Aug. 1998.
[20]
H. Mannila, H. Toivonen, and A. I. Verkamo. Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery, 1:259-289, 1997.
[21]
R. Ng, L. V. S. Lakshmanan, J. Han, and A. Pang. Exploratory mining and pruning optimizations of constrained associations rules. In Proc. 1998 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'98), pages 13-24, Seattle, WA, June 1998.
[22]
J. Park, M. Chen, and P. Yu. An effective hash-based algorithm for mining association rules. In Proc. 1995 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD '95), pages 175-186, San Jose, CA, May 1995.
[23]
J. Pei and J. Han. Can we push more constraints into frequent pattern mining? In Proc. 2000 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD'00), pages 350-354, Boston, MA, Aug. 2000.
[24]
J. Pei, J. Han, and L. V. S. Lakshmanan. Mining frequent itemsets with convertible constraints. In Proc. 2001 Int. Conf. Data Engineering (ICDE'01), pages 433-332, Heidelberg, Germany, April 2001.
[25]
J. Pei, J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang. H-Mine: Hyper-structure mining of frequent patterns in large databases. In Proc. 2001 Int. Conf. Data Mining (ICDM'01), pages 441-448, San Jose, CA, Nov. 2001.
[26]
J. Pei, J. Han, and R. Mao. CLOSET: An efficient algorithm for mining frequent closed itemsets. In Proc. 2000 ACM-SIGMOD Int. Workshop Data Mining and Knowledge Discovery (DMKD'00), pages 11-20, Dallas, TX, May 2000.
[27]
J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proc. 2001 Int. Conf. Data Engineering (ICDE'01), pages 215-224, Heidelberg, Germany, April 2001.
[28]
J. Pei, J. Han, and W. Wang. Constraint-based sequential pattern mining in large databases. In Submitted for publication, May 2002.
[29]
S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational database systems: Alternatives and implications. In Proc. 1998 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'98), pages 343-354, Seattle, WA, June 1998.
[30]
A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. In Proc. 1995 Int. Conf. Very Large Data Bases (VLDB'95), pages 432-443, Zurich, Switzerland, Sept. 1995.
[31]
C. Silverstein, S. Brin, R. Motwani, and J. Ullman. Scalable techniques for mining causal structures. In Proc. 1998 Int. Conf. Very Large Data Bases (VLDB'98), pages 594-605, New York, NY, Aug. 1998.
[32]
R. Srikant, Q. Vu, and R. Agrawal. Mining association rules with item constraints. In Proc. 1997 Int. Conf. Knowledge Discovery and Data Mining (KDD'97), pages 67-73, Newport Beach, CA, Aug. 1997.
[33]
M. J. Zaki and C. J. Hsiao. CHARM: An efficient algorithm for closed itemset mining. In Proc. 2002 SIAM Int. Conf. Data Mining, pages 457-473, Arlington, VA, April 2002.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGKDD Explorations Newsletter
ACM SIGKDD Explorations Newsletter  Volume 4, Issue 1
June 2002
75 pages
ISSN:1931-0145
EISSN:1931-0153
DOI:10.1145/568574
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 2002
Published in SIGKDD Volume 4, Issue 1

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)12
  • Downloads (Last 6 weeks)2
Reflects downloads up to 11 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)High utility itemsets mining from transactional databases: a surveyApplied Intelligence10.1007/s10489-023-04853-553:22(27655-27703)Online publication date: 16-Sep-2023
  • (2022)Flexibly Mining Better Patterns2022 IEEE International Conference on Big Data (Big Data)10.1109/BigData55660.2022.10021132(6212-6221)Online publication date: 17-Dec-2022
  • (2021)A Survey of Research on Data Analytics-Based Legal TechSustainability10.3390/su1314808513:14(8085)Online publication date: 20-Jul-2021
  • (2021)Similarity Association Pattern Mining in Transaction DatabasesInternational Conference on Data Science, E-learning and Information Systems 202110.1145/3460620.3460752(180-184)Online publication date: 5-Apr-2021
  • (2021)HyperNOMADACM Transactions on Mathematical Software10.1145/345097547:3(1-27)Online publication date: 26-Jun-2021
  • (2021)NEPACM Transactions on Mathematical Software10.1145/344754447:3(1-29)Online publication date: 26-Jun-2021
  • (2021)LinneaACM Transactions on Mathematical Software10.1145/344663247:3(1-26)Online publication date: 26-Jun-2021
  • (2021)PROLISEANACM Transactions on Internet Technology10.1145/343225021:1(1-29)Online publication date: 13-Jan-2021
  • (2021)Beyond FrequencyACM Transactions on Internet Technology10.1145/342549821:1(1-32)Online publication date: 5-Jan-2021
  • (2020)Low-cost Security for Next-generation IoT NetworksACM Transactions on Internet Technology10.1145/340628020:3(1-31)Online publication date: 5-Sep-2020
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media