Abstract
High-utility itemset mining (HUIM) is a useful data mining tool for analyzing customer behavior. Whereas frequent itemset mining (FIM) algorithms detect frequent patterns, HUIM algorithms discover the most beneficial itemsets in transaction databases, namely the high-utility itemsets (HUIs). Several algorithms have been proposed to carry out this task effectively, but most of them ignore the categorization of items. In many real-world transaction databases, information about the categories and subcategories of items, represented as a taxonomy, is available and can be useful. Because traditional HUIM algorithms do not use it, they can only discover itemsets at the lowest level of abstraction and miss important patterns at higher levels. To address this limitation, this work incorporates the item taxonomy into the mining process. To further improve performance, several effective pruning techniques are revised and used to tighten the search space when the taxonomy of items is considered. To accurately find multi-level HUIs in transaction databases enriched with taxonomy information, a new algorithm called MLHMiner (Multiple-Level HMiner) is proposed as an extension of the HMiner algorithm. We also prove that the pruning techniques of HMiner can be applied at different abstraction levels to mine multi-level HUIs efficiently. Experimental evaluations on several real and synthetic databases show that the proposed approach identifies useful patterns at different abstraction levels with high efficiency.
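To illustrate why a taxonomy can expose patterns that item-level mining misses, the following minimal sketch computes itemset utilities at two abstraction levels by brute force. It is not MLHMiner and uses no pruning. The items, unit profits, quantities, and taxonomy are hypothetical, and the sketch assumes that the utility of a generalized item in a transaction is the sum of the utilities of its descendant items in that transaction.

```python
from itertools import combinations

# Hypothetical unit profits and taxonomy (child -> parent); not from the paper.
PROFIT = {"cola": 1, "juice": 2, "bread": 1, "cake": 5}
PARENT = {"cola": "drinks", "juice": "drinks", "bread": "bakery", "cake": "bakery"}

# Each transaction maps a purchased item to its quantity (hypothetical data).
TRANSACTIONS = [
    {"cola": 2, "bread": 1},
    {"juice": 1, "cake": 1},
    {"cola": 1, "cake": 2},
    {"juice": 3, "bread": 2},
]


def lift(transaction, level):
    """Return {item: utility} for one transaction at the given level.

    Level 0 keeps the purchased items. Level 1 replaces each item by its
    parent category and sums the utilities of items sharing a parent
    (assumed definition of generalized-item utility).
    """
    utilities = {}
    for item, qty in transaction.items():
        name = item if level == 0 else PARENT[item]
        utilities[name] = utilities.get(name, 0) + qty * PROFIT[item]
    return utilities


def mine_level(level, min_util):
    """Enumerate every itemset at one level; keep those with utility >= min_util."""
    lifted = [lift(t, level) for t in TRANSACTIONS]
    items = sorted({i for t in lifted for i in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # An itemset only earns utility in transactions containing all its items.
            total = sum(
                sum(t[i] for i in itemset)
                for t in lifted
                if all(i in t for i in itemset)
            )
            if total >= min_util:
                result[itemset] = total
    return result


if __name__ == "__main__":
    print(mine_level(0, 12))  # item level
    print(mine_level(1, 12))  # category level
```

With a minimum utility of 12, only the single item cake qualifies at the item level, whereas the category pair (bakery, drinks) reaches a utility of 29 at the category level. No pair of individual items reaches the threshold, so this pattern is invisible without the taxonomy.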
Acknowledgements
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number C2020-28-04.
Cite this article
Tung, N.T., Nguyen, L.T.T., Nguyen, T.D.D. et al. An efficient method for mining multi-level high utility Itemsets. Appl Intell 52, 5475–5496 (2022). https://doi.org/10.1007/s10489-021-02681-z