[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

A survey of incremental high-utility itemset mining

Published: 01 March 2018 Publication History

Abstract

Traditional association rule mining has been widely studied. But it is unsuitable for real-world applications where factors such as unit profits of items and purchase quantities must be considered. High-utility itemset mining HUIM is designed to find highly profitable patterns by considering both the purchase quantities and unit profits of items. However, most HUIM algorithms are designed to be applied to static databases. But in real-world applications such as market basket analysis and business decision-making, databases are often dynamically updated by inserting new data such as customer transactions. Several researchers have proposed algorithms to discover high-utility itemsets HUIs in dynamically updated databases. Unlike batch algorithms, which always process a database from scratch, incremental high-utility itemset mining iHUIM algorithms incrementally update and output HUIs, thus reducing the cost of discovering HUIs. This paper provides an up-to-date survey of the state-of-the-art iHUIM algorithms, including Apriori-based, tree-based, and utility-list-based approaches. To the best of our knowledge, this is the first survey on the mining task of incremental high-utility itemset mining. The paper also identifies several important issues and research challenges for iHUIM. WIREs Data Mining Knowl Discov 2018, 8:e1242. doi: 10.1002/widm.1242

References

[1]
Agrawal, R., Imielinski, T., &Swami, A. 1993. Database mining: A performance perspective. IEEE Transactions on Knowledge and Data Engineering, Volume 5 Issue 6, pp.914-925.
[2]
Agrawal, R., &Srikant, R. 1994. Fast algorithms for mining association rules in large databases. In: Proceedings of the International Conference on Very Large Data Bases, Santiago de Chile, Chile, pp. 487-499.
[3]
Ahmed, C. F., Tanbeer, S. K., Jeong, B. S., &Le, Y. K. 2009. Efficient tree structures for high utility pattern mining in incremental databases. IEEE Transactions on Knowledge and Data Engineering, Volume 21 Issue 12, pp.1708-1721.
[4]
Chan, R., Yang, Q., &Shen, Y. D. 2003. Mining high utility itemsets. In: Proceedings of the IEEE International Conference on Data Mining, Melbourne, FL, pp. 19-26.
[5]
Chen, M. S., Han, J., &Yu, P. S. 1996. Data mining: An overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering, Volume 8 Issue 6, pp.866-883.
[6]
Cheung, D. W. L., Han, J., Ng, V., &Wong, C. Y. 1996. Maintenance of discovered association rules in large databases: An incremental updating technique. In: Proceedings of the International Conference on Data Engineering, New Orleans, LA, pp. 106-114.
[7]
Dean, J., &Ghemawat, S. 2010. MapReduce: A flexible data processing tool. Communications of the ACM, Volume 53 Issue 1, pp.72-77.
[8]
Erwin, A., Gopalan, R. P., &Achuthan, N. R. 2007. CTU-mine: An efficient high utility itemset mining algorithm using the pattern growth approach. In: Proceedings of the IEEE International Conference on Computer and Information Technology, Fukushima, Japan, pp. 71-76.
[9]
Fournier-Viger, P., Lin, J. C. W., Gueniche, T., &Barhate, P. 2015. Efficient incremental high utility itemset mining. In: Proceedings of the ASE BigData & Social Informatics, Kaohsiung, Taiwan, p. 53.
[10]
Fournier-Viger, P., Lin, J. C.-W., Vo, B., Chi, T. T., Zhang, J., &Le, H. B. 2017. A survey of itemset mining. WIREs: Data Mining and Knowledge Discovery, Volume 7 Issue 4, pp.e1207.
[11]
Fournier-Viger, P., Wu, C. W., Zida, S., &Tseng, V. S. 2014. FHM: Faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Proceedings of the International symposium on methodologies for intelligent systems, Roskilde, Denmark, pp. 83-92.
[12]
Gan, W., Lin, J. C. W., Chao, H. C., &Zhan, J. 2017. Data mining in distributed environment: A survey. WIREs: Data Mining and Knowledge Discovery, Volume 7 6, e1216.
[13]
Han, J., Pei, J., Yin, Y., &Mao, R. 2004. Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, Volume 8 Issue 1, pp.53-87.
[14]
Hong, T. P., Kuo, C. S., &Chi, S. C. 1999. Mining association rules from quantitative data. Intelligent Data Analysis, Volume 3 Issue 5, pp.363-376.
[15]
Hong, T. P., Lee, C. H., &Wang, S. L. 2011. Effective utility mining with the measure of average utility. Expert Systems with Applications, Volume 38 Issue 7, pp.8259-8265.
[16]
Hong, T. P., Lin, C. W., &Wu, Y. L. 2008. Incrementally fast updated frequent pattern trees. Expert Systems with Applications, Volume 34 Issue 4, pp.2424-2435.
[17]
Hong, T. P., Lin, K. Y., &Chien, B. C. 2003. Mining fuzzy multiple-level association rules from quantitative data. Applied Intelligence, Volume 18 Issue 1, pp.79-90.
[18]
Hong, T. P., Wang, C. Y., &Tao, Y. H. 2001. A new incremental data mining algorithm using pre-large itemsets. Intelligent Data Analysis, Volume 5 Issue 2, pp.111-129.
[19]
Kim, D., &Yun, U. 2017. Efficient algorithm for mining high average-utility itemsets in incremental transaction databases. Applied Intelligence, Volume 47 Issue 1, pp.114-131.
[20]
Krishnamoorthy, S. 2015. Pruning strategies for mining high utility itemsets. Expert Systems with Applications, Volume 42 Issue 5, pp.2371-2381.
[21]
Lan, G. C., Hong, T. P., Huang, J. P., &Tseng, V. S. 2014. On-shelf utility mining with negative item values. Expert Systems with Applications, Volume 41 Issue 7, pp.3450-3459.
[22]
Lan, G. C., Hong, T. P., &Tseng, V. S. 2012. A projection-based approach for discovering high average-utility itemsets. Journal of Information Science and Engineering, Volume 28 Issue 1, pp.193-209.
[23]
Lan, G. C., Hong, T. P., &Tseng, V. S. 2013. An efficient projection-based indexing approach for mining high utility itemsets. Knowledge and Information Systems, Volume 38 Issue 1, pp.85-107.
[24]
Li, Y. C., Yeh, J. S., &Chang, C. C. 2005. Efficient algorithms for mining share-frequent itemsets. In: Proceedings of the 11th World Congress of Intl. Fuzzy Systems Association, Beijing, China, pp. 534-539.
[25]
Lin, C. W., Hong, T. P., Lan, G. C., Wong, J. W., &Lin, W. Y. 2014. Incrementally mining high utility patterns based on pre-large concept. Applied Intelligence, Volume 40 Issue 2, pp.343-357.
[26]
Lin, C. W., Hong, T. P., &Lu, W. H. 2009. The pre-FUFP algorithm for incremental mining. Expert Systems with Applications, Volume 36 Issue 5, pp.9498-9505.
[27]
Lin, C. W., Hong, T. P., &Lu, W. H. 2011. An effective tree structure for mining high utility itemsets. Expert Systems with Applications, Volume 38 Issue 6, pp.7419-7424.
[28]
Lin, C. W., Lan, G. C., &Hong, T. P. 2012. An incremental mining algorithm for high utility itemsets. Expert Systems with Applications, Volume 39 Issue 8, pp.7173-7180.
[29]
Lin, J. C. W., Fournier-Viger, P., &Gan, W. 2016. FHN: An efficient algorithm for mining high-utility itemsets with negative unit profits. Knowledge-Based Systems, 111, 283-298.
[30]
Lin, J. C. W., Gan, W., Fournier-Viger, P., Hong, T. P., &Chao, H. C. 2017. FDHUP: Fast algorithm for mining discriminative high utility patterns. Knowledge and Information Systems, Volume 51 Issue 3, pp.873-909.
[31]
Lin, J. C. W., Gan, W., Fournier-Viger, P., Hong, T. P., &Tseng, V. S. 2016. Fast algorithms for mining high-utility itemsets with various discount strategies. Advanced Engineering Informatics, Volume 30 Issue 2, pp.109-126.
[32]
Lin, J. C. W., Gan, W., Fournier-Viger, P., Hong, T. P., &Zhan, J. 2016. Efficient mining of high-utility itemsets using multiple minimum utility thresholds. Knowledge-Based Systems, Volume 113, pp.100-115.
[33]
Lin, J. C. W., Gan, W., &Hong, T. P. 2015. A fast updated algorithm to maintain the discovered high-utility itemsets for transaction modification. Advanced Engineering Informatics, Volume 29 Issue 3, pp.562-574.
[34]
Lin, J. C. W., Gan, W., &Hong, T. P. 2016. A fast maintenance algorithm of the discovered high-utility itemsets with transaction deletion. Intelligent Data Analysis, Volume 20 Issue 4, pp.891-913.
[35]
Lin, J. C. W., Gan, W., Hong, T. P., &Tseng, V. S. 2015. Efficient algorithms for mining up-to-date high-utility patterns. Advanced Engineering Informatics, Volume 29 Issue 3, pp.648-661.
[36]
Lin, J. C. W., Gan, W., Hong, T. P., &Zhang, B. 2015. An incremental high-utility mining algorithm with transaction insertion. The Scientific World Journal, Volume 2015, pp.1-15.
[37]
Lin, Y. C., Wu, C. W., &Tseng, V. S. 2015. Mining high utility itemsets in big data. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Ho Chi Minh City, Vietnam, pp. 649-661.
[38]
Liu, J., Wang, K., &Fung, B. C. M. 2012. Direct discovery of high utility itemsets without candidate generation. In: Proceedings of the IEEE 12th International Conference on Data Mining, Brussels, Belgium, pp. 984-989.
[39]
Liu, M., &Qu, J. 2012. Mining high utility itemsets without candidate generation. In: Proceedings of the ACM International Conference on Information and Knowledge Management, Maui, HI, pp. 55-64.
[40]
Liu, Y., Liao, W. K., &Choudhary, A. 2005. A two-phase algorithm for fast discovery of high utility itemsets. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Hanoi, Vietnam, pp. 689-695.
[41]
Mai, T., Vo, B., &Nguyen, L. T. T. 2017. A lattice-based approach for mining high utility association rules. Information Sciences, Volume 399, pp.81-97.
[42]
Martínez-Ballesteros, M., Martínez-Álvarez, F., Troncoso, A., &Riquelme, J. C. 2014. Selecting the best measures to discover quantitative association rules. Neurocomputing, Volume 126, pp.3-14.
[43]
Ng, R. T., Lakshmanan, L., Han, J., &Pang, A. 1998. Exploratory mining and pruning optimizations of constrained association rules. ACM SIGMOD Record, Volume 27 Issue 2, pp.13-24.
[44]
Pei, J., &Han, J. 2002. Constrained frequent pattern mining: A pattern-growth view. ACM SIGKDD Explorations Newsletter, Volume 4 Issue 1, pp.31-39.
[45]
Ryang, H., &Yun, U. 2015. Top-k high utility pattern mining with effective threshold raising strategies. Knowledge-Based Systems, Volume 76, pp.109-126.
[46]
Ryang, H., &Yun, U. 2016. Indexed list-based high utility pattern mining with utility upper-bound reduction and pattern combination techniques. Knowledge and Information Systems, 512, pp.627-659.
[47]
Tseng, V. S., Shie, B. E., Wu, C. W., &Yu, P. S. 2013. Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Transactions on Knowledge and Data Engineering, Volume 25 Issue 8, pp.1772-1786.
[48]
Tseng, V. S., Wu, C. W., Fournier-Viger, P., &Yu, P. S. 2016. Efficient algorithms for mining top-k high utility itemsets. IEEE Transactions on Knowledge and Data Engineering, Volume 28 Issue 1, pp.54-67.
[49]
Tseng, V. S., Wu, C. W., Shie, B. E., &Yu, P. S. UP-Growth: An efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, pp. 253-262, 2010.
[50]
Yao, H., &Hamilton, H. J. 2006. Mining itemset utilities from transaction databases. Data & Knowledge Engineering, Volume 59 Issue 3, pp.603-626.
[51]
Yao, H., Hamilton, H. J., &Butz, C. J. 2004. A foundational approach to mining itemset utilities from databases. In: Proceedings of the SIAM International Conference on Data Mining, Orlando, FL, pp. 211-225.
[52]
Yeh, J. S., Chang, C. Y., &Wang, Y. T. 2008. Efficient algorithms for incremental utility mining. In: Proceedings of the ACM International Conference on Ubiquitous Information Management and Cmmunication, Suwon, Korea, pp. 212-217.
[53]
Yun, U., &Kim, D. 2017. Mining of high average-utility itemsets using novel list structure and pruning strategy. Future Generation Computer Systems, Volume 68, pp.346-360.
[54]
Yun, U., &Ryang, H. 2014. Incremental high utility pattern mining with static and dynamic databases. Applied Intelligence, Volume 42 Issue 2, pp.323-352.
[55]
Yun, U., Ryang, H., Lee, G., &Fujita, H. 2017. An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowledge-Based Systems, Volume 124, pp.188-206.
[56]
Zaharia, M., Chowdhury, M., Das, T., Dave, A., &Ma, J. 2012. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, San Jose, CA, USA, pp. 2-2.
[57]
Zheng, H. T., &Li, Z. iCHUM: An efficient algorithm for high utility mining in incremental databases. In: Proceedings of the International Conference on Knowledge Science, Engineering and Management, Chongqing, China, pp. 212-223, 2015.
[58]
Zida, S., Fournier-Viger, P., Lin, J. C. W., Wu, C. W., &Tseng, V. S. 2015. EFIM: A highly efficient algorithm for high-utility itemset mining. In: Proceedings of the 14th Mexican Intern. Conference on Artificial Intelligence, Cuernavaca, Mexico, pp. 530-546.

Cited By

View all
  1. A survey of incremental high-utility itemset mining

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
    Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery  Volume 8, Issue 2
    March 2018

    Publisher

    John Wiley & Sons, Inc.

    United States

    Publication History

    Published: 01 March 2018

    Author Tags

    1. data mining
    2. dynamic database
    3. high utility
    4. incremental mining
    5. quantities

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 11 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Targeted mining of top-k high utility itemsetsEngineering Applications of Artificial Intelligence10.1016/j.engappai.2023.107047126:PDOnline publication date: 27-Feb-2024
    • (2023)HUSP-SP: Faster Utility Mining on Sequence DataACM Transactions on Knowledge Discovery from Data10.1145/359793518:1(1-21)Online publication date: 10-Aug-2023
    • (2023)Weighted Statistically Significant Pattern MiningCompanion Proceedings of the ACM Web Conference 202310.1145/3543873.3587586(1276-1285)Online publication date: 30-Apr-2023
    • (2022)Media Content Mining Based on Artificial Intelligence and Network InteractionMobile Information Systems10.1155/2022/67059862022Online publication date: 1-Jan-2022
    • (2022)An Incremental Interesting Maximal Frequent Itemset Mining Based on FP-Growth AlgorithmComplexity10.1155/2022/19425172022Online publication date: 1-Jan-2022
    • (2022)Frequent Itemset Mining with Local Differential PrivacyProceedings of the 31st ACM International Conference on Information & Knowledge Management10.1145/3511808.3557327(1146-1155)Online publication date: 17-Oct-2022
    • (2022)Mining with Rarity for Web IntelligenceCompanion Proceedings of the Web Conference 202210.1145/3487553.3524708(973-981)Online publication date: 25-Apr-2022
    • (2022)Fast RFM Model for Customer SegmentationCompanion Proceedings of the Web Conference 202210.1145/3487553.3524707(965-972)Online publication date: 25-Apr-2022
    • (2022)Discovering Top-k Profitable Patterns for Smart ManufacturingCompanion Proceedings of the Web Conference 202210.1145/3487553.3524706(956-964)Online publication date: 25-Apr-2022
    • (2022)Efficient approach of sliding window-based high average-utility pattern mining with list structuresKnowledge-Based Systems10.1016/j.knosys.2022.109702256:COnline publication date: 28-Nov-2022
    • Show More Cited By

    View Options

    View options

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media