A SAT-Based Approach for Mining High Utility Itemsets from Transaction Databases

Amel Hidouri^13,14,
Said Jabbour¹⁴,
Badran Raddaoui¹⁵ &
…
Boutheina Ben Yaghlane¹³

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 12393))

Included in the following conference series:

International Conference on Big Data Analytics and Knowledge Discovery

1172 Accesses
5 Citations

Abstract

Mining high utility itemsets is a keystone in several data analysis tasks. High Utility Itemset Mining generalizes the frequent itemset mining problem by considering item quantities and weights. A high utility itemset is a set of items that appears in the transadatabase and having a high importance to the user, measured by a utility function. The utility of a pattern can be quantified in terms of various objective criteria, e.g., profit, frequency, and weight. Constraint Programming (CP) and Propositional Satisfiability (SAT) based frameworks for modeling and solving pattern mining tasks have gained a considerable attention in recent few years. This paper introduces the first declarative framework for mining high utility itemsets from transaction databases. First, we model the problem of mining high utility itemsets from transaction databases as a propositional satifiability problem. Moreover, to facilitate the mining task, we add an additional constraint to the efficiency of our method by using weighted clique cover problem. Then, we exploit the efficient SAT solving techniques to output all the high utility itemsets in the data that satisfy a user-specified minimum support and minimum utility values. Experimental evaluations on real and synthetic datasets show that the performance of our proposed approach is close to that of the optimal case of state-of-the-art HUIM algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: GBP 19.95; Price includes VAT (United Kingdom)

eBook: GBP 35.99; Price includes VAT (United Kingdom)

Softcover Book: GBP 44.99; Price includes VAT (United Kingdom)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

A Comparative Study of SAT-Based Itemsets Mining

MMC: efficient and effective closed high-utility itemset mining

Article 24 May 2024

References

Erwin, A., Gopalan, R.P., Achuthan, N.R.: Efficient mining of high utility itemsets from large datasets. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS (LNAI), vol. 5012, pp. 554–561. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-68125-0_50
Chapter Google Scholar
Chan, R., Yang, Q., Shen, Y.: Mining high utility itemsets. In: Proceedings of the IEEE Third International Conference on Data Mining, pp. 19–26, November (2003)
Google Scholar
Ahmed, C.F., Tanbeer, S.K., Jeong, B.-S., Lee, Y.-K.: Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans. Knowl. Data Eng. 21, 1708–1721 (2009)
Article Google Scholar
Shie, B.-E., Hsiao, H.-F., Tseng, V.S., Yu, P.S.: Mining high utility mobile sequential patterns in mobile commerce environments. In: Yu, J.X., Kim, M.H., Unland, R. (eds.) DASFAA 2011. LNCS, vol. 6587, pp. 224–238. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20149-3_18
Chapter Google Scholar
Yen, S.-J., Lee, Y.-S.: Mining high utility quantitative association rules. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds.) DaWaK 2007. LNCS, vol. 4654, pp. 283–292. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74553-2_26
Chapter Google Scholar
Fournier-Viger, P., Lin, J.C.-W., Vo, B., Chi, T.T., Zhang, J., Le, H.B.: A Survey of Itemset Mining. WIREs Interdisciplinary reviews - Data Mining and Knowledge Discovery, Wiley (2017)
Google Scholar
Chonsheng, Z., et al.: An empirical evaluation of high utility itemset mining algorithms. In: 101, pp. 91–115 (2018)
Google Scholar
Guns, T., Nijssen, S., Raedt, L.D.: Itemset mining: a constraint programming perspective. Artif. Intell. 175, 1951–1983 (2011)
Article MathSciNet Google Scholar
Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: ACM SIGKDD, pp. 204–212 (2008)
Google Scholar
Coquery, E., Jabbour, S., Sais, L., Salhi, Y.: A SAT-based approach for discovering frequent, closed and maximal patterns in a sequence. In: Proceedings of ECAI, pp. 258–263 (2012)
Google Scholar
Jabbour, S., Sais, L., Salhi, Y.: The top-k frequent closed itemset mining using top-k SAT problem. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 403–418. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40994-3_26
Chapter Google Scholar
Liu, Y., Liao, W., Choudhary, A.: A two-phase algorithm for fast discovery of high utility itemsets. In: Ho, T.B., Cheung, D., Liu, H. (eds.) PAKDD 2005. LNCS (LNAI), vol. 3518, pp. 689–695. Springer, Heidelberg (2005). https://doi.org/10.1007/11430919_79
Chapter Google Scholar
Tseng, V.S., Shie, B.-E., Wu, C.-W., Yu, P.S.: Efficient algorithms for mining high utility itemsets from transaction databases. IEEE Trans. Knowl. Data Eng. 25(8), 1772–1786 (2013)
Article Google Scholar
Liu, M., Qu, J.: Mining high utility itemsets without candidate generation. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, p. 5564 (2012)
Google Scholar
Krishnamoorthy, S.: Pruning strategies for mining high utility itemsets. Expert Syst. Appl. 42(5), 2371–2381 (2015)
Article Google Scholar
Fournier-Viger, P., Wu, C.-W., Zida, S., Tseng, V.S.: FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Proceedings of the 21st International Symposium on Methodologies for Intelligent System, p. 8392 (2014)
Google Scholar
Zida, S., Fournier-Viger, P., Lin, J.C.W., Wu, C.W., V.S. Tseng: EFIM: A Highly Ecient Algorithm for High-Utility Itemset Mining
Google Scholar
Peng, A.Y., Koh, Y.S., Riddle, P.: mHUIMiner: a fast high utility itemset mining algorithm for sparse datasets. In: Kim, J., Shim, K., Cao, L., Lee, J.-G., Lin, X., Moon, Y.-S. (eds.) PAKDD 2017. LNCS (LNAI), vol. 10235, pp. 196–207. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57529-2_16
Chapter Google Scholar
Duong, Q.-H., Fournier-Viger, P., Ramampiaro, H., Nørvåg, K., Dam, T.-L.: Efficient high utility itemset mining using buffered utility-lists. Appl. Intell. 48(7), 1859–1877 (2017). https://doi.org/10.1007/s10489-017-1057-2
Article Google Scholar
Tseitin, G.: On the complexity of derivations in the propositional calculus. In: Structures in Constructives Mathematics and Mathematical Logic, Part II, pp. 115–125 (1968)
Google Scholar
Hsu, W.-L., Nemhauser, G.L.: A polynomial algorithm for the minimum weighted clique cover problem on claw-free perfect graphs. Discrete Math. 38(1), 65–71 (1982)
Article MathSciNet Google Scholar
Golumbic, M.C., Stern, M., Levy, A., Morgenstern, G. (eds.): WG 2012. LNCS, vol. 7551. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34611-8
Book Google Scholar
Fournier-Viger, P.: SPMF: a Java open-source data mining library. www.philippe-fournier-viger.com/spmf/. Accessed 15 Aug 2018
Boudane, A., Jabbour, S., Sais, L., Salhi, Y.: A SAT-based approach for mining association rules. In: Proceedings of IJCAI, pp. 2472–2478 (2016)
Google Scholar
Boudane, A., Jabbour, S., Sais, L., Salhi, Y.: Enumerating non-redundant association rules using satisfiability. In: Kim, J., Shim, K., Cao, L., Lee, J.-G., Lin, X., Moon, Y.-S. (eds.) PAKDD 2017. LNCS (LNAI), vol. 10234, pp. 824–836. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57454-7_64
Chapter Google Scholar
Cheng, J., Ke, Y., Ada Wai-Chee, F., Yu, J.X., Zhu, L.: Finding maximal cliques in massive networks. ACM Trans. Database Syst. 36, 4 (2011)
Article Google Scholar
Eblen, J.D., Phillips, C.A., Rogers, G.L., et al.: The maximum clique enumeration problem: algorithms, applications, and implementations. BMC Bioinform. 13, S5 (2012)
Article Google Scholar
Jabbour, S., et al.: Boolean satisfiability for sequence mining. In: Proceedings of CIKM 2013, pp. 649–658 (2013)
Google Scholar
Fournier-Viger, P., Chun-Wei Lin, J., Truong-Chi, T., Nkambou, R.: A survey of high utility itemset mining. In: Fournier-Viger, P., Lin, J.C.-W., Nkambou, R., Vo, B., Tseng, V.S. (eds.) High-Utility Pattern Mining. SBD, vol. 51, pp. 1–45. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04921-8_1
Chapter Google Scholar
Sörensson, N.E.: An Extensible SAT-solver. In: Proceedings of SAT, pp. 502–518 (2003)
Google Scholar

Download references

Author information

Authors and Affiliations

LARODEC, Univeristy of Tunis High Institut of Management, Cite Bouchoucha, Tunisia
Amel Hidouri & Boutheina Ben Yaghlane
CRIL - CNRS UMR 8188, University of Artois, Arras, France
Amel Hidouri & Said Jabbour
SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Paris, France
Badran Raddaoui

Authors

Amel Hidouri
View author publications
You can also search for this author in PubMed Google Scholar
Said Jabbour
View author publications
You can also search for this author in PubMed Google Scholar
Badran Raddaoui
View author publications
You can also search for this author in PubMed Google Scholar
Boutheina Ben Yaghlane
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Said Jabbour .

Editor information

Editors and Affiliations

Department of Library and Information, Yonsei University, Seoul, Korea (Republic of)
Min Song
Drexel University, Philadelphia, PA, USA
Il-Yeol Song
Johannes Kepler University of Linz, Linz, Austria
Gabriele Kotsis
Software Competence Center Hagenberg (Au), Vienna, Wien, Austria
A Min Tjoa
Johannes Kepler University of Linz, Linz, Oberösterreich, Austria
Ismail Khalil

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Hidouri, A., Jabbour, S., Raddaoui, B., Yaghlane, B.B. (2020). A SAT-Based Approach for Mining High Utility Itemsets from Transaction Databases. In: Song, M., Song, IY., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2020. Lecture Notes in Computer Science(), vol 12393. Springer, Cham. https://doi.org/10.1007/978-3-030-59065-9_8

Download citation

DOI: https://doi.org/10.1007/978-3-030-59065-9_8
Published: 11 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59064-2
Online ISBN: 978-3-030-59065-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics