Abstract
High utility pattern mining has been studied as an essential topic in the field of pattern mining in order to satisfy requirements of many real-world applications that need to process non-binary databases including item importance such as market analysis. In this paper, we propose an efficient algorithm with a novel indexed list-based data structure for mining high utility patterns. Previous approaches first generate an enormous number of candidate patterns on the basis of overestimation methods in their mining processes and then identify actual high utility patterns from the candidates through an additional database scan, which leads to high computational overheads. Although several list-based algorithms to discover high utility patterns without candidate generation have been suggested in recent years, they require a large number of comparison operations. Our method facilitates efficient mining of high utility patterns with the proposed indexed list by effectively reducing the total number of such operations. Moreover, we develop two techniques based on this novel data structure to more enhance mining performance of the proposed method. Experimental results on real and synthetic datasets show that the proposed algorithm mines high utility patterns more efficiently than the state-of-the-art algorithms.
Similar content being viewed by others
References
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Bocca JB, Jarke M, Zaniolo C (eds) Proceedings of 20th international conference on very large data bases, Santiago de Chile, Chile, September 1994, pp 487–499
Ahmed CF, Tanbeer SK, Jeong B-S, Choi H-J (2012) Interactive mining of high utility patterns over data streams. Expert Syst Appl 39(15):11979–11991
Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K, Choi H-J (2012) Single-pass incremental and interactive mining for weighted frequent patterns. Expert Syst Appl 39(9):7976–7994
Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2009) Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans Knowl Data Eng 21(12):1708–1721
Chen L, Mei Q (2014) Mining frequent items in data stream using time fading model. Inf Sci 257(1):54–69
Feng L, Wang L, Jin B (2013) UT-tree: efficient mining of high utility itemsets from data streams. Intell Data Anal 17(4):585–602
Fournier-Viger P, Wu C-W, Zida S, Tseng VS (2014) FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Andreasen T, Christiansen H, Talavera JCC, Ras ZW (eds) Proceedings of 21st international symposium on methodologies for intelligent systems, Roskilde, Denmark, June 2014. Lecture Notes in Computer Science 8502, Springer, Berlin, pp 83–92
Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Chen W, Naughton JF, Bernstein PA (eds) Proceedings of 2000 ACM SIGMOD international conference on management of data, Dallas, Texas, USA, May 2000, pp 1–12
Huynh-Thi-Le Q, Le T, Vo B, Le HB (2015) An efficient and effective algorithm for mining top-rank-k frequent patterns. Expert Syst Appl 42(1):156–164
Kim J, Yun U, Pyun G, Ryang H, Lee G, Yoon E, Ryu K (2015) A blog ranking algorithm using analysis of both blog influence and characteristics of blog posts. Clust Comput 18(1):157–164
Lan GC, Hong TP, Tseng VS, Wang SL (2014) Applying the maximum utility measure in high utility sequential pattern mining. Expert Syst Appl 41(11):5071–5081
Lan GC, Hong TP, Tseng VS (2014) An efficient projection-based indexing approach for mining high utility itemsets. Knowl Inf Syst 38(1):85–107
Lee G, Yun U, Ryang H (2015) Mining weighted erasable patterns by using underestimated constraint-based pruning technique. J Intell Fuzzy Syst 28(3):1145–1157
Lee G, Yun U, Ryu K (2014) Sliding window based weighted maximal frequent pattern mining over data streams. Expert Syst Appl 41(2):694–708
Li YC, Yeh J-S, Chang C-C (2008) Isolated items discarding strategy for discovering high utility itemsets. Data Knowl Eng 64(1):198–217
Lin C-W, Hong T-P, Lan G-C, Wong J-W, Lin W-Y (2015) Efficient updating of discovered high-utility itemsets for transaction deletion in dynamic databases. Adv Eng Inform 29(1):16–27
Lin C-W, Lan G-C, Hong T-P (2015) Mining high utility itemsets for transaction deletion in a dynamic database. Intell Data Anal 19(1):43–55
Lin C-W, Hong T-P, Lan G-C, Wong J-W, Lin W-Y (2014) Incrementally mining high utility patterns based on pre-large concept. Appl Intell 40(2):343–357
Liu M, Qu J-F (2012) Mining high utility itemsets without candidate generation. In: Chen X-W, Lebanon G, Wang H, Zaki MJ (eds) Proceedings of 21st ACM international conference on information and knowledge management (CIKM’12), Maui, HI, USA, October 2012, pp 55–64
Liu J, Wang K, Fung BCM (2012) Direct discovery of high utility itemsets without candidate generation. In: Zaki MJ, Siebes A, Yu JX, Goethals B, Webb GI, Wu X (eds) Proceedings of 12th IEEE international conference on data mining (ICDM’12), Brussels, Belgium, December 2012, pp 984–898
Liu Y, Liao W-K, Choudhary AN (2005) A two-phase algorithm for fast discovery of high utility itemsets. In: Ho TB, Cheung DW-L, Liu H (eds) Proceedings of advances in knowledge discovery and data mining, 9th Pacific-Asia conference (PAKDD’05), Hanoi, Vietnam, May 2005, pp 689–695
Pisharath J, Liu Y, Ozisikyilmaz B, Narayanan R, Liao WK, Choudhary A, Memik G (2005) NU-MineBench version 2.0 dataset and technical report. http://cucis.ece.northwestern.edu/projects/DMS/MineBench.html
Pyun G, Yun U (2014) Mining top-k frequent patterns with combination reducing techniques. Appl Intell 41(1):76–98
Pyun G, Yun U, Ryu K (2014) Efficient frequent pattern mining based on linear prefix tree. Knowl Based Syst 55:125–139
Ryang H, Yun U (2015) Top-k high utility pattern mining with effective threshold raising strategies. Knowl Based Syst 76:109–126
Ryang H, Yun U, Ryu K (2014) Discovering high utility itemsets with multiple minimum supports. Intell Data Anal 18(6):1027–1047
Sahoo J, Das AK, Goswami A (2015) An efficient approach for mining association rules from high utility itemsets. Expert Syst Appl 42(13):5754–5778
Song W, Liu Y, Li J (2014) Mining high utility itemsets by dynamically pruning the tree structure. Appl Intell 40(1):29–43
Troiano L, Scibelli G (2014) Mining frequent itemsets in data streams within a time horizon. Data Knowl Eng 89:21–37
Tseng VS, Wu C-W, Fournier-Viger P, Yu PS (2015) Efficient algorithms for mining the concise and lossless representation of high utility itemsets. IEEE Trans Knowl Data Eng 27(3):726–739
Tseng VS, Shie B-E, Wu C-W, Yu PS (2013) Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Trans Knowl Data Eng 25(8):1772–1786
Tseng VS, Wu C-W, Shie B-E, Yu PS (2010) UP-Growth: an efficient algorithm for high utility itemset mining. In: Rao B, Krishnapuram B, Tomkins A, Yang Q (eds) Proceedings of 16th ACM SIGKDD international conference on knowledge discovery and data mining, DC, USA, Washington, July 2010, pp 253–262
Yun U, Ryang H (2015) Incremental high utility pattern mining with static and dynamic databases. Appl Intell 42(2):323–352
Yun U, Kim J (2015) A fast perturbation algorithm using tree structure for privacy preserving utility mining. Expert Syst Appl 42(3):1149–1165
Yun U, Ryang H, Ryu K (2014) High utility itemset mining with techniques for reducing overestimated utilities and pruning candidates. Expert Syst Appl 41(8):3861–3878
Zhang X, Deng Z-H (2015) Mining summarization of high utility itemsets. Knowl Based Syst 84:67–77
ZiHayat M, An A (2014) Mining top-k high utility patterns over data streams. Inf Sci 285:138–161
Acknowledgments
This research was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF Nos. 20152062051 and 20155054624).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Ryang, H., Yun, U. Indexed list-based high utility pattern mining with utility upper-bound reduction and pattern combination techniques. Knowl Inf Syst 51, 627–659 (2017). https://doi.org/10.1007/s10115-016-0989-x
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10115-016-0989-x