New approaches for mining high utility itemsets with multiple utility thresholds

Bao Huynh¹,
N. T. Tung¹,
Trinh D. D. Nguyen²,
Cuong Trinh³,
Vaclav Snasel³ &
…
Loan Nguyen ORCID: orcid.org/0000-0001-6440-6462^4,5

Abstract

Two research directions have recently attracted attention in data mining: frequent itemset mining (FIM) and high utility itemset mining (HUIM). FIM outputs the itemsets whose items occur together at least as often as a required threshold, but it ignores the benefit (utility) attached to each item. HUIM algorithms were proposed to overcome this disadvantage of FIM, but they use a single threshold, which is unsuitable for real-world applications that often require different utility thresholds. HUIM algorithms with multiple utility thresholds have also been proposed, but they suffer from high mining time and memory consumption. This paper therefore presents an efficient method for Mining High Utility Itemsets with Multiple Utility Thresholds (MHUI-MUT). The method applies upper bounds and pruning strategies to reduce database scanning, and proposes a cut-off threshold to minimize the mining time. We also present a method to parallelize the algorithm to make the most of the performance of multi-core computers. The experimental results show that the MHUI-MUT algorithm is faster than the previous algorithm, and that the parallel version in turn outperforms the proposed sequential algorithm.
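To make the difference between support, utility, and per-item utility thresholds concrete, the following minimal Python sketch enumerates itemsets by brute force on a toy database. It is not the MHUI-MUT algorithm: it uses no upper bounds, pruning, or cut-off threshold. The data, the names `unit_profit`, `min_util`, `support`, `utility`, `threshold`, and `mine_brute_force`, and the rule that an itemset's threshold is the smallest threshold among its items (a convention common in the multiple-threshold literature) are all illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

# Toy quantitative database (hypothetical): item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
    {"a": 1, "b": 1, "d": 2},
]
# Hypothetical unit profit and per-item minimum utility threshold.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}
min_util = {"a": 20, "b": 12, "c": 15, "d": 12}


def support(itemset):
    """FIM measure: number of transactions containing every item."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def utility(itemset):
    """HUIM measure: total profit of the itemset over the transactions
    that contain all of its items."""
    return sum(
        sum(t[i] * unit_profit[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def threshold(itemset):
    """Assumed rule: an itemset's threshold is the minimum of the
    thresholds of its items."""
    return min(min_util[i] for i in itemset)


def mine_brute_force():
    """Check every non-empty itemset against its own threshold.
    Exponential in the number of items; for illustration only."""
    items = sorted(unit_profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset)
            if u >= threshold(itemset):
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in mine_brute_force().items():
        print(itemset, "utility =", u, "support =", support(itemset))
```

In this toy data the single item `b` falls below its threshold while the larger itemset `(a, b)` meets its own. Utility is not anti-monotone the way support is, which is why efficient miners rely on upper bounds to prune the search space.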

Data availability

http://www.philippe-fournier-viger.com/spmf/index.php?link=datasets.php

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

Faculty of Information Technology, HUTECH University, Ho Chi Minh City, Vietnam
Bao Huynh & N. T. Tung
Faculty of Information Technology, Industrial University of Ho Chi Minh City, Ho Chi Minh City, Vietnam
Trinh D. D. Nguyen
Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, Ostrava, Czech Republic
Cuong Trinh & Vaclav Snasel
School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam
Loan Nguyen
Vietnam National University, Ho Chi Minh City, Vietnam
Loan Nguyen

Contributions

Bao Huynh: Methodology, Writing - Original draft preparation; N.T. Tung: Data curation, Software, Validation, Writing - Review & Editing; Trinh D.D. Nguyen: Validation, Writing - Review & Editing; Vaclav Snasel: Validation, Writing - Review & Editing; Loan T.T. Nguyen: Methodology, Validation, Writing - Review & Editing.

Corresponding author

Correspondence to Loan Nguyen.

Ethics declarations

Competing Interests

The authors have no relevant financial or non-financial interests to disclose.

Ethical and informed consent for data used.

We use public datasets in our experiments.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huynh, B., Tung, N.T., Nguyen, T.D.D. et al. New approaches for mining high utility itemsets with multiple utility thresholds. Appl Intell 54, 767–790 (2024). https://doi.org/10.1007/s10489-023-05145-8

Accepted: 30 October 2023
Published: 19 December 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s10489-023-05145-8
