Abstract
Mining high utility itemsets is a keystone in several data analysis tasks. High Utility Itemset Mining generalizes the frequent itemset mining problem by considering item quantities and weights. A high utility itemset is a set of items that appears in the transadatabase and having a high importance to the user, measured by a utility function. The utility of a pattern can be quantified in terms of various objective criteria, e.g., profit, frequency, and weight. Constraint Programming (CP) and Propositional Satisfiability (SAT) based frameworks for modeling and solving pattern mining tasks have gained a considerable attention in recent few years. This paper introduces the first declarative framework for mining high utility itemsets from transaction databases. First, we model the problem of mining high utility itemsets from transaction databases as a propositional satifiability problem. Moreover, to facilitate the mining task, we add an additional constraint to the efficiency of our method by using weighted clique cover problem. Then, we exploit the efficient SAT solving techniques to output all the high utility itemsets in the data that satisfy a user-specified minimum support and minimum utility values. Experimental evaluations on real and synthetic datasets show that the performance of our proposed approach is close to that of the optimal case of state-of-the-art HUIM algorithms.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Erwin, A., Gopalan, R.P., Achuthan, N.R.: Efficient mining of high utility itemsets from large datasets. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS (LNAI), vol. 5012, pp. 554–561. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-68125-0_50
Chan, R., Yang, Q., Shen, Y.: Mining high utility itemsets. In: Proceedings of the IEEE Third International Conference on Data Mining, pp. 19–26, November (2003)
Ahmed, C.F., Tanbeer, S.K., Jeong, B.-S., Lee, Y.-K.: Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans. Knowl. Data Eng. 21, 1708–1721 (2009)
Shie, B.-E., Hsiao, H.-F., Tseng, V.S., Yu, P.S.: Mining high utility mobile sequential patterns in mobile commerce environments. In: Yu, J.X., Kim, M.H., Unland, R. (eds.) DASFAA 2011. LNCS, vol. 6587, pp. 224–238. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20149-3_18
Yen, S.-J., Lee, Y.-S.: Mining high utility quantitative association rules. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds.) DaWaK 2007. LNCS, vol. 4654, pp. 283–292. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74553-2_26
Fournier-Viger, P., Lin, J.C.-W., Vo, B., Chi, T.T., Zhang, J., Le, H.B.: A Survey of Itemset Mining. WIREs Interdisciplinary reviews - Data Mining and Knowledge Discovery, Wiley (2017)
Chonsheng, Z., et al.: An empirical evaluation of high utility itemset mining algorithms. In: 101, pp. 91–115 (2018)
Guns, T., Nijssen, S., Raedt, L.D.: Itemset mining: a constraint programming perspective. Artif. Intell. 175, 1951–1983 (2011)
Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: ACM SIGKDD, pp. 204–212 (2008)
Coquery, E., Jabbour, S., Sais, L., Salhi, Y.: A SAT-based approach for discovering frequent, closed and maximal patterns in a sequence. In: Proceedings of ECAI, pp. 258–263 (2012)
Jabbour, S., Sais, L., Salhi, Y.: The top-k frequent closed itemset mining using top-k SAT problem. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 403–418. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40994-3_26
Liu, Y., Liao, W., Choudhary, A.: A two-phase algorithm for fast discovery of high utility itemsets. In: Ho, T.B., Cheung, D., Liu, H. (eds.) PAKDD 2005. LNCS (LNAI), vol. 3518, pp. 689–695. Springer, Heidelberg (2005). https://doi.org/10.1007/11430919_79
Tseng, V.S., Shie, B.-E., Wu, C.-W., Yu, P.S.: Efficient algorithms for mining high utility itemsets from transaction databases. IEEE Trans. Knowl. Data Eng. 25(8), 1772–1786 (2013)
Liu, M., Qu, J.: Mining high utility itemsets without candidate generation. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, p. 5564 (2012)
Krishnamoorthy, S.: Pruning strategies for mining high utility itemsets. Expert Syst. Appl. 42(5), 2371–2381 (2015)
Fournier-Viger, P., Wu, C.-W., Zida, S., Tseng, V.S.: FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Proceedings of the 21st International Symposium on Methodologies for Intelligent System, p. 8392 (2014)
Zida, S., Fournier-Viger, P., Lin, J.C.W., Wu, C.W., V.S. Tseng: EFIM: A Highly Ecient Algorithm for High-Utility Itemset Mining
Peng, A.Y., Koh, Y.S., Riddle, P.: mHUIMiner: a fast high utility itemset mining algorithm for sparse datasets. In: Kim, J., Shim, K., Cao, L., Lee, J.-G., Lin, X., Moon, Y.-S. (eds.) PAKDD 2017. LNCS (LNAI), vol. 10235, pp. 196–207. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57529-2_16
Duong, Q.-H., Fournier-Viger, P., Ramampiaro, H., Nørvåg, K., Dam, T.-L.: Efficient high utility itemset mining using buffered utility-lists. Appl. Intell. 48(7), 1859–1877 (2017). https://doi.org/10.1007/s10489-017-1057-2
Tseitin, G.: On the complexity of derivations in the propositional calculus. In: Structures in Constructives Mathematics and Mathematical Logic, Part II, pp. 115–125 (1968)
Hsu, W.-L., Nemhauser, G.L.: A polynomial algorithm for the minimum weighted clique cover problem on claw-free perfect graphs. Discrete Math. 38(1), 65–71 (1982)
Golumbic, M.C., Stern, M., Levy, A., Morgenstern, G. (eds.): WG 2012. LNCS, vol. 7551. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34611-8
Fournier-Viger, P.: SPMF: a Java open-source data mining library. www.philippe-fournier-viger.com/spmf/. Accessed 15 Aug 2018
Boudane, A., Jabbour, S., Sais, L., Salhi, Y.: A SAT-based approach for mining association rules. In: Proceedings of IJCAI, pp. 2472–2478 (2016)
Boudane, A., Jabbour, S., Sais, L., Salhi, Y.: Enumerating non-redundant association rules using satisfiability. In: Kim, J., Shim, K., Cao, L., Lee, J.-G., Lin, X., Moon, Y.-S. (eds.) PAKDD 2017. LNCS (LNAI), vol. 10234, pp. 824–836. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57454-7_64
Cheng, J., Ke, Y., Ada Wai-Chee, F., Yu, J.X., Zhu, L.: Finding maximal cliques in massive networks. ACM Trans. Database Syst. 36, 4 (2011)
Eblen, J.D., Phillips, C.A., Rogers, G.L., et al.: The maximum clique enumeration problem: algorithms, applications, and implementations. BMC Bioinform. 13, S5 (2012)
Jabbour, S., et al.: Boolean satisfiability for sequence mining. In: Proceedings of CIKM 2013, pp. 649–658 (2013)
Fournier-Viger, P., Chun-Wei Lin, J., Truong-Chi, T., Nkambou, R.: A survey of high utility itemset mining. In: Fournier-Viger, P., Lin, J.C.-W., Nkambou, R., Vo, B., Tseng, V.S. (eds.) High-Utility Pattern Mining. SBD, vol. 51, pp. 1–45. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04921-8_1
Sörensson, N.E.: An Extensible SAT-solver. In: Proceedings of SAT, pp. 502–518 (2003)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hidouri, A., Jabbour, S., Raddaoui, B., Yaghlane, B.B. (2020). A SAT-Based Approach for Mining High Utility Itemsets from Transaction Databases. In: Song, M., Song, IY., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2020. Lecture Notes in Computer Science(), vol 12393. Springer, Cham. https://doi.org/10.1007/978-3-030-59065-9_8
Download citation
DOI: https://doi.org/10.1007/978-3-030-59065-9_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59064-2
Online ISBN: 978-3-030-59065-9
eBook Packages: Computer ScienceComputer Science (R0)