More Web Proxy on the site http://driver.im/

article

A survey of incremental high-utility itemset mining

Authors:

Jerry Chun-Wei Lin,

Philippe Fournier-Viger,

Han-Chieh Chao,

Tzung-Pei Hong,

Hamido FujitaAuthors Info & Claims

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Volume 8, Issue 2

Page n/a

https://doi.org/10.1002/widm.1242

Published: 01 March 2018 Publication History

Abstract

Traditional association rule mining has been widely studied. But it is unsuitable for real-world applications where factors such as unit profits of items and purchase quantities must be considered. High-utility itemset mining HUIM is designed to find highly profitable patterns by considering both the purchase quantities and unit profits of items. However, most HUIM algorithms are designed to be applied to static databases. But in real-world applications such as market basket analysis and business decision-making, databases are often dynamically updated by inserting new data such as customer transactions. Several researchers have proposed algorithms to discover high-utility itemsets HUIs in dynamically updated databases. Unlike batch algorithms, which always process a database from scratch, incremental high-utility itemset mining iHUIM algorithms incrementally update and output HUIs, thus reducing the cost of discovering HUIs. This paper provides an up-to-date survey of the state-of-the-art iHUIM algorithms, including Apriori-based, tree-based, and utility-list-based approaches. To the best of our knowledge, this is the first survey on the mining task of incremental high-utility itemset mining. The paper also identifies several important issues and research challenges for iHUIM. WIREs Data Mining Knowl Discov 2018, 8:e1242. doi: 10.1002/widm.1242

References

[1]

Agrawal, R., Imielinski, T., &Swami, A. 1993. Database mining: A performance perspective. IEEE Transactions on Knowledge and Data Engineering, Volume 5 Issue 6, pp.914-925.

Digital Library

[2]

Agrawal, R., &Srikant, R. 1994. Fast algorithms for mining association rules in large databases. In: Proceedings of the International Conference on Very Large Data Bases, Santiago de Chile, Chile, pp. 487-499.

[3]

Ahmed, C. F., Tanbeer, S. K., Jeong, B. S., &Le, Y. K. 2009. Efficient tree structures for high utility pattern mining in incremental databases. IEEE Transactions on Knowledge and Data Engineering, Volume 21 Issue 12, pp.1708-1721.

Digital Library

[4]

Chan, R., Yang, Q., &Shen, Y. D. 2003. Mining high utility itemsets. In: Proceedings of the IEEE International Conference on Data Mining, Melbourne, FL, pp. 19-26.

[5]

Chen, M. S., Han, J., &Yu, P. S. 1996. Data mining: An overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering, Volume 8 Issue 6, pp.866-883.

Digital Library

[6]

Cheung, D. W. L., Han, J., Ng, V., &Wong, C. Y. 1996. Maintenance of discovered association rules in large databases: An incremental updating technique. In: Proceedings of the International Conference on Data Engineering, New Orleans, LA, pp. 106-114.

[7]

Dean, J., &Ghemawat, S. 2010. MapReduce: A flexible data processing tool. Communications of the ACM, Volume 53 Issue 1, pp.72-77.

[8]

Erwin, A., Gopalan, R. P., &Achuthan, N. R. 2007. CTU-mine: An efficient high utility itemset mining algorithm using the pattern growth approach. In: Proceedings of the IEEE International Conference on Computer and Information Technology, Fukushima, Japan, pp. 71-76.

[9]

Fournier-Viger, P., Lin, J. C. W., Gueniche, T., &Barhate, P. 2015. Efficient incremental high utility itemset mining. In: Proceedings of the ASE BigData & Social Informatics, Kaohsiung, Taiwan, p. 53.

[10]

Fournier-Viger, P., Lin, J. C.-W., Vo, B., Chi, T. T., Zhang, J., &Le, H. B. 2017. A survey of itemset mining. WIREs: Data Mining and Knowledge Discovery, Volume 7 Issue 4, pp.e1207.

[11]

Fournier-Viger, P., Wu, C. W., Zida, S., &Tseng, V. S. 2014. FHM: Faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Proceedings of the International symposium on methodologies for intelligent systems, Roskilde, Denmark, pp. 83-92.

[12]

Gan, W., Lin, J. C. W., Chao, H. C., &Zhan, J. 2017. Data mining in distributed environment: A survey. WIREs: Data Mining and Knowledge Discovery, Volume 7 6, e1216.

[13]

Han, J., Pei, J., Yin, Y., &Mao, R. 2004. Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, Volume 8 Issue 1, pp.53-87.

[14]

Hong, T. P., Kuo, C. S., &Chi, S. C. 1999. Mining association rules from quantitative data. Intelligent Data Analysis, Volume 3 Issue 5, pp.363-376.

[15]

Hong, T. P., Lee, C. H., &Wang, S. L. 2011. Effective utility mining with the measure of average utility. Expert Systems with Applications, Volume 38 Issue 7, pp.8259-8265.

Digital Library

[16]

Hong, T. P., Lin, C. W., &Wu, Y. L. 2008. Incrementally fast updated frequent pattern trees. Expert Systems with Applications, Volume 34 Issue 4, pp.2424-2435.

[17]

Hong, T. P., Lin, K. Y., &Chien, B. C. 2003. Mining fuzzy multiple-level association rules from quantitative data. Applied Intelligence, Volume 18 Issue 1, pp.79-90.

[18]

Hong, T. P., Wang, C. Y., &Tao, Y. H. 2001. A new incremental data mining algorithm using pre-large itemsets. Intelligent Data Analysis, Volume 5 Issue 2, pp.111-129.

[19]

Kim, D., &Yun, U. 2017. Efficient algorithm for mining high average-utility itemsets in incremental transaction databases. Applied Intelligence, Volume 47 Issue 1, pp.114-131.

Digital Library

[20]

Krishnamoorthy, S. 2015. Pruning strategies for mining high utility itemsets. Expert Systems with Applications, Volume 42 Issue 5, pp.2371-2381.

[21]

Lan, G. C., Hong, T. P., Huang, J. P., &Tseng, V. S. 2014. On-shelf utility mining with negative item values. Expert Systems with Applications, Volume 41 Issue 7, pp.3450-3459.

[22]

Lan, G. C., Hong, T. P., &Tseng, V. S. 2012. A projection-based approach for discovering high average-utility itemsets. Journal of Information Science and Engineering, Volume 28 Issue 1, pp.193-209.

[23]

Lan, G. C., Hong, T. P., &Tseng, V. S. 2013. An efficient projection-based indexing approach for mining high utility itemsets. Knowledge and Information Systems, Volume 38 Issue 1, pp.85-107.

[24]

Li, Y. C., Yeh, J. S., &Chang, C. C. 2005. Efficient algorithms for mining share-frequent itemsets. In: Proceedings of the 11th World Congress of Intl. Fuzzy Systems Association, Beijing, China, pp. 534-539.

[25]

Lin, C. W., Hong, T. P., Lan, G. C., Wong, J. W., &Lin, W. Y. 2014. Incrementally mining high utility patterns based on pre-large concept. Applied Intelligence, Volume 40 Issue 2, pp.343-357.

Digital Library

[26]

Lin, C. W., Hong, T. P., &Lu, W. H. 2009. The pre-FUFP algorithm for incremental mining. Expert Systems with Applications, Volume 36 Issue 5, pp.9498-9505.

[27]

Lin, C. W., Hong, T. P., &Lu, W. H. 2011. An effective tree structure for mining high utility itemsets. Expert Systems with Applications, Volume 38 Issue 6, pp.7419-7424.

[28]

Lin, C. W., Lan, G. C., &Hong, T. P. 2012. An incremental mining algorithm for high utility itemsets. Expert Systems with Applications, Volume 39 Issue 8, pp.7173-7180.

[29]

Lin, J. C. W., Fournier-Viger, P., &Gan, W. 2016. FHN: An efficient algorithm for mining high-utility itemsets with negative unit profits. Knowledge-Based Systems, 111, 283-298.

Digital Library

[30]

Lin, J. C. W., Gan, W., Fournier-Viger, P., Hong, T. P., &Chao, H. C. 2017. FDHUP: Fast algorithm for mining discriminative high utility patterns. Knowledge and Information Systems, Volume 51 Issue 3, pp.873-909.

[31]

Lin, J. C. W., Gan, W., Fournier-Viger, P., Hong, T. P., &Tseng, V. S. 2016. Fast algorithms for mining high-utility itemsets with various discount strategies. Advanced Engineering Informatics, Volume 30 Issue 2, pp.109-126.

Digital Library

[32]

Lin, J. C. W., Gan, W., Fournier-Viger, P., Hong, T. P., &Zhan, J. 2016. Efficient mining of high-utility itemsets using multiple minimum utility thresholds. Knowledge-Based Systems, Volume 113, pp.100-115.

Digital Library

[33]

Lin, J. C. W., Gan, W., &Hong, T. P. 2015. A fast updated algorithm to maintain the discovered high-utility itemsets for transaction modification. Advanced Engineering Informatics, Volume 29 Issue 3, pp.562-574.

Digital Library

[34]

Lin, J. C. W., Gan, W., &Hong, T. P. 2016. A fast maintenance algorithm of the discovered high-utility itemsets with transaction deletion. Intelligent Data Analysis, Volume 20 Issue 4, pp.891-913.

[35]

Lin, J. C. W., Gan, W., Hong, T. P., &Tseng, V. S. 2015. Efficient algorithms for mining up-to-date high-utility patterns. Advanced Engineering Informatics, Volume 29 Issue 3, pp.648-661.

Digital Library

[36]

Lin, J. C. W., Gan, W., Hong, T. P., &Zhang, B. 2015. An incremental high-utility mining algorithm with transaction insertion. The Scientific World Journal, Volume 2015, pp.1-15.

[37]

Lin, Y. C., Wu, C. W., &Tseng, V. S. 2015. Mining high utility itemsets in big data. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Ho Chi Minh City, Vietnam, pp. 649-661.

[38]

Liu, J., Wang, K., &Fung, B. C. M. 2012. Direct discovery of high utility itemsets without candidate generation. In: Proceedings of the IEEE 12th International Conference on Data Mining, Brussels, Belgium, pp. 984-989.

Digital Library

[39]

Liu, M., &Qu, J. 2012. Mining high utility itemsets without candidate generation. In: Proceedings of the ACM International Conference on Information and Knowledge Management, Maui, HI, pp. 55-64.

[40]

Liu, Y., Liao, W. K., &Choudhary, A. 2005. A two-phase algorithm for fast discovery of high utility itemsets. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Hanoi, Vietnam, pp. 689-695.

[41]

Mai, T., Vo, B., &Nguyen, L. T. T. 2017. A lattice-based approach for mining high utility association rules. Information Sciences, Volume 399, pp.81-97.

Digital Library

[42]

Martínez-Ballesteros, M., Martínez-Álvarez, F., Troncoso, A., &Riquelme, J. C. 2014. Selecting the best measures to discover quantitative association rules. Neurocomputing, Volume 126, pp.3-14.

Digital Library

[43]

Ng, R. T., Lakshmanan, L., Han, J., &Pang, A. 1998. Exploratory mining and pruning optimizations of constrained association rules. ACM SIGMOD Record, Volume 27 Issue 2, pp.13-24.

[44]

Pei, J., &Han, J. 2002. Constrained frequent pattern mining: A pattern-growth view. ACM SIGKDD Explorations Newsletter, Volume 4 Issue 1, pp.31-39.

[45]

Ryang, H., &Yun, U. 2015. Top-k high utility pattern mining with effective threshold raising strategies. Knowledge-Based Systems, Volume 76, pp.109-126.

Digital Library

[46]

Ryang, H., &Yun, U. 2016. Indexed list-based high utility pattern mining with utility upper-bound reduction and pattern combination techniques. Knowledge and Information Systems, 512, pp.627-659.

[47]

Tseng, V. S., Shie, B. E., Wu, C. W., &Yu, P. S. 2013. Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Transactions on Knowledge and Data Engineering, Volume 25 Issue 8, pp.1772-1786.

Digital Library

[48]

Tseng, V. S., Wu, C. W., Fournier-Viger, P., &Yu, P. S. 2016. Efficient algorithms for mining top-k high utility itemsets. IEEE Transactions on Knowledge and Data Engineering, Volume 28 Issue 1, pp.54-67.

Digital Library

[49]

Tseng, V. S., Wu, C. W., Shie, B. E., &Yu, P. S. UP-Growth: An efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, pp. 253-262, 2010.

Digital Library

[50]

Yao, H., &Hamilton, H. J. 2006. Mining itemset utilities from transaction databases. Data & Knowledge Engineering, Volume 59 Issue 3, pp.603-626.

[51]

Yao, H., Hamilton, H. J., &Butz, C. J. 2004. A foundational approach to mining itemset utilities from databases. In: Proceedings of the SIAM International Conference on Data Mining, Orlando, FL, pp. 211-225.

[52]

Yeh, J. S., Chang, C. Y., &Wang, Y. T. 2008. Efficient algorithms for incremental utility mining. In: Proceedings of the ACM International Conference on Ubiquitous Information Management and Cmmunication, Suwon, Korea, pp. 212-217.

[53]

Yun, U., &Kim, D. 2017. Mining of high average-utility itemsets using novel list structure and pruning strategy. Future Generation Computer Systems, Volume 68, pp.346-360.

[54]

Yun, U., &Ryang, H. 2014. Incremental high utility pattern mining with static and dynamic databases. Applied Intelligence, Volume 42 Issue 2, pp.323-352.

Digital Library

[55]

Yun, U., Ryang, H., Lee, G., &Fujita, H. 2017. An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowledge-Based Systems, Volume 124, pp.188-206.

Digital Library

[56]

Zaharia, M., Chowdhury, M., Das, T., Dave, A., &Ma, J. 2012. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, San Jose, CA, USA, pp. 2-2.

Digital Library

[57]

Zheng, H. T., &Li, Z. iCHUM: An efficient algorithm for high utility mining in incremental databases. In: Proceedings of the International Conference on Knowledge Science, Engineering and Management, Chongqing, China, pp. 212-223, 2015.

Digital Library

[58]

Zida, S., Fournier-Viger, P., Lin, J. C. W., Wu, C. W., &Tseng, V. S. 2015. EFIM: A highly efficient algorithm for high-utility itemset mining. In: Proceedings of the 14th Mexican Intern. Conference on Artificial Intelligence, Cuernavaca, Mexico, pp. 530-546.

Cited By

Huang SGan WMiao JHan XFournier-Viger P(2024)Targeted mining of top-k high utility itemsetsEngineering Applications of Artificial Intelligence10.1016/j.engappai.2023.107047126:PDOnline publication date: 27-Feb-2024
https://dl.acm.org/doi/10.1016/j.engappai.2023.107047
Zhang CYang YDu ZGan WYu P(2023)HUSP-SP: Faster Utility Mining on Sequence DataACM Transactions on Knowledge Discovery from Data10.1145/359793518:1(1-21)Online publication date: 10-Aug-2023
https://dl.acm.org/doi/10.1145/3597935
Zhou TQi ZGan WWan SChen G(2023)Weighted Statistically Significant Pattern MiningCompanion Proceedings of the ACM Web Conference 202310.1145/3543873.3587586(1276-1285)Online publication date: 30-Apr-2023
https://dl.acm.org/doi/10.1145/3543873.3587586
Show More Cited By

A survey of incremental high-utility itemset mining
1. Information systems
  1. Information systems applications

Recommendations

Incremental High Fuzzy Utility Itemset Mining
MISNC '22: Proceedings of the 9th Multidisciplinary International Social Networks Conference

In data mining, frequent-pattern mining methods are used for handling binary databases. Utility mining addresses this limitation by considering the item utilities and quantities when discovering the high utility itemsets. In addition, to make these high ...
Mining high utility patterns in incremental databases
ICUIMC '09: Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication

Frequent pattern mining techniques treat all items in the database equally by taking into consideration only the presence of an item within a transaction. However, the customer may purchase more than one of the same item, and the unit price may vary ...
An effective approach for maintenance of pre-large-based frequent-itemset lattice in incremental mining

Incremental mining has attracted the attention of many researchers due to its usefulness in online applications. Many algorithms have thus been proposed for incrementally mining frequent itemsets. Maintaining a frequent-itemset lattice (FIL) is ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery Volume 8, Issue 2

March 2018

EISSN:1942-4795

Issue’s Table of Contents

Publisher

John Wiley & Sons, Inc.

United States

Publication History

Published: 01 March 2018

Author Tags

Qualifiers

Article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

44
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 11 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Huang SGan WMiao JHan XFournier-Viger P(2024)Targeted mining of top-k high utility itemsetsEngineering Applications of Artificial Intelligence10.1016/j.engappai.2023.107047126:PDOnline publication date: 27-Feb-2024
https://dl.acm.org/doi/10.1016/j.engappai.2023.107047
Zhang CYang YDu ZGan WYu P(2023)HUSP-SP: Faster Utility Mining on Sequence DataACM Transactions on Knowledge Discovery from Data10.1145/359793518:1(1-21)Online publication date: 10-Aug-2023
https://dl.acm.org/doi/10.1145/3597935
Zhou TQi ZGan WWan SChen G(2023)Weighted Statistically Significant Pattern MiningCompanion Proceedings of the ACM Web Conference 202310.1145/3543873.3587586(1276-1285)Online publication date: 30-Apr-2023
https://dl.acm.org/doi/10.1145/3543873.3587586
Kang KWang ZZhang H(2022)Media Content Mining Based on Artificial Intelligence and Network InteractionMobile Information Systems10.1155/2022/67059862022Online publication date: 1-Jan-2022
https://dl.acm.org/doi/10.1155/2022/6705986
Alsaeedi HAlhegami A(2022)An Incremental Interesting Maximal Frequent Itemset Mining Based on FP-Growth AlgorithmComplexity10.1155/2022/19425172022Online publication date: 1-Jan-2022
https://dl.acm.org/doi/10.1155/2022/1942517
Li JGan WGui YWu YYu PAl Hasan MXiong L(2022)Frequent Itemset Mining with Local Differential PrivacyProceedings of the 31st ACM International Conference on Information & Knowledge Management10.1145/3511808.3557327(1146-1155)Online publication date: 17-Oct-2022
https://dl.acm.org/doi/10.1145/3511808.3557327
Gui YGan WChen YWu Y(2022)Mining with Rarity for Web IntelligenceCompanion Proceedings of the Web Conference 202210.1145/3487553.3524708(973-981)Online publication date: 25-Apr-2022
https://dl.acm.org/doi/10.1145/3487553.3524708
Wan SChen JQi ZGan WTang L(2022)Fast RFM Model for Customer SegmentationCompanion Proceedings of the Web Conference 202210.1145/3487553.3524707(965-972)Online publication date: 25-Apr-2022
https://dl.acm.org/doi/10.1145/3487553.3524707
Wan SChen JZhang PGan WGu T(2022)Discovering Top-k Profitable Patterns for Smart ManufacturingCompanion Proceedings of the Web Conference 202210.1145/3487553.3524706(956-964)Online publication date: 25-Apr-2022
https://dl.acm.org/doi/10.1145/3487553.3524706
Lee CRyu TKim HKim HVo BLin JYun U(2022)Efficient approach of sliding window-based high average-utility pattern mining with list structuresKnowledge-Based Systems10.1016/j.knosys.2022.109702256:COnline publication date: 28-Nov-2022
https://dl.acm.org/doi/10.1016/j.knosys.2022.109702
Show More Cited By

View Options

View options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Issue’s Table of Contents