Efficient discovery of correlated patterns using multiple minimum all-confidence thresholds

Uday Kiran Rage¹ & Masaru Kitsuregawa²,³

Abstract

Correlated patterns are an important class of regularities that exist in a database. Although no measure is universally accepted as the best for judging the interestingness of a pattern, all-confidence is emerging as a popular measure for discovering correlated patterns, because it satisfies both the anti-monotonic and null-invariance properties. The former property makes pattern mining practicable in real-world applications. The latter enables the user to discover patterns involving both frequent and rare items without generating a huge number of patterns. In this paper, we show that although the measure satisfies the null-invariance property, mining patterns containing both frequent and rare items with a single minimum all-confidence (minAllConf) threshold leads to the dilemma known as the “rare item problem.” At a high minAllConf, the discovered correlated patterns involving rare items are very short. At a low minAllConf, combinatorial explosion can occur, producing too many patterns. To confront the problem, the paper introduces an alternative model based on the concept of multiple minAllConf thresholds. The proposed model generalizes the existing model of correlated patterns and lets the user specify a different minAllConf for each pattern depending on the frequencies of its items. A pattern-growth algorithm, called GCoMine, is also proposed to discover the patterns. Experimental results show that GCoMine is efficient and that the proposed model can address the problem effectively.
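To make the measure concrete, the following is a minimal Python sketch of all-confidence under its commonly cited definition: the support of a pattern divided by the largest support of any single item in it. The abstract does not state this formula, so treat it as an assumption. The function names and the toy transactions are hypothetical and chosen only for illustration; this is not GCoMine and it does not implement the multiple-threshold model.

```python
def support(pattern, transactions):
    """Count the transactions that contain every item of `pattern`."""
    items = set(pattern)
    return sum(1 for t in transactions if items <= set(t))


def all_confidence(pattern, transactions):
    """Assumed definition: sup(X) / max(sup({i}) for each item i in X)."""
    max_item_support = max(support([i], transactions) for i in pattern)
    if max_item_support == 0:
        return 0.0
    return support(pattern, transactions) / max_item_support


# Made-up toy database: 'bread' and 'jam' are frequent, 'bed' and 'pillow' are rare.
transactions = [
    {"bread", "jam"}, {"bread", "jam"}, {"bread", "jam"}, {"bread", "jam"},
    {"bread"}, {"bread"}, {"jam"},
    {"bed", "pillow"}, {"bed", "pillow"}, {"bed"},
]

frequent_pair = all_confidence({"bread", "jam"}, transactions)   # 4 / 6
rare_pair = all_confidence({"bed", "pillow"}, transactions)      # 2 / 3

# Null-invariance: transactions containing none of the pattern's items
# do not change its all-confidence.
padded = transactions + [{"milk"}] * 100
rare_pair_padded = all_confidence({"bed", "pillow"}, padded)

# Anti-monotonicity: a superset never scores higher than its subsets.
mixed_triple = all_confidence({"bread", "jam", "bed"}, transactions)
```

In this toy data the rare pair scores as high as the frequent pair even though its support is much lower, and padding the database with unrelated transactions leaves the score unchanged. Those are the two properties the abstract relies on.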

Author information

Authors and Affiliations

Institute of Industrial Science, University of Tokyo, Tokyo, Japan
Uday Kiran Rage
Institute of Industrial Science, University of Tokyo, Tokyo, Japan
Masaru Kitsuregawa
National Institute of Informatics, Tokyo, Japan
Masaru Kitsuregawa

Corresponding author

Correspondence to Uday Kiran Rage.

About this article

Cite this article

Rage, U.K., Kitsuregawa, M. Efficient discovery of correlated patterns using multiple minimum all-confidence thresholds. J Intell Inf Syst 45, 357–377 (2015). https://doi.org/10.1007/s10844-014-0314-7

Received: 30 July 2013
Revised: 03 December 2013
Accepted: 17 February 2014
Published: 21 March 2014
Issue Date: December 2015
DOI: https://doi.org/10.1007/s10844-014-0314-7
