[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

A SAT-Based Approach for Mining High Utility Itemsets from Transaction Databases

  • Conference paper
  • First Online:
Big Data Analytics and Knowledge Discovery (DaWaK 2020)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 12393))

Included in the following conference series:

Abstract

Mining high utility itemsets is a keystone in several data analysis tasks. High Utility Itemset Mining generalizes the frequent itemset mining problem by considering item quantities and weights. A high utility itemset is a set of items that appears in the transadatabase and having a high importance to the user, measured by a utility function. The utility of a pattern can be quantified in terms of various objective criteria, e.g., profit, frequency, and weight. Constraint Programming (CP) and Propositional Satisfiability (SAT) based frameworks for modeling and solving pattern mining tasks have gained a considerable attention in recent few years. This paper introduces the first declarative framework for mining high utility itemsets from transaction databases. First, we model the problem of mining high utility itemsets from transaction databases as a propositional satifiability problem. Moreover, to facilitate the mining task, we add an additional constraint to the efficiency of our method by using weighted clique cover problem. Then, we exploit the efficient SAT solving techniques to output all the high utility itemsets in the data that satisfy a user-specified minimum support and minimum utility values. Experimental evaluations on real and synthetic datasets show that the performance of our proposed approach is close to that of the optimal case of state-of-the-art HUIM algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
GBP 19.95
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
GBP 35.99
Price includes VAT (United Kingdom)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
GBP 44.99
Price includes VAT (United Kingdom)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Erwin, A., Gopalan, R.P., Achuthan, N.R.: Efficient mining of high utility itemsets from large datasets. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS (LNAI), vol. 5012, pp. 554–561. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-68125-0_50

    Chapter  Google Scholar 

  2. Chan, R., Yang, Q., Shen, Y.: Mining high utility itemsets. In: Proceedings of the IEEE Third International Conference on Data Mining, pp. 19–26, November (2003)

    Google Scholar 

  3. Ahmed, C.F., Tanbeer, S.K., Jeong, B.-S., Lee, Y.-K.: Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans. Knowl. Data Eng. 21, 1708–1721 (2009)

    Article  Google Scholar 

  4. Shie, B.-E., Hsiao, H.-F., Tseng, V.S., Yu, P.S.: Mining high utility mobile sequential patterns in mobile commerce environments. In: Yu, J.X., Kim, M.H., Unland, R. (eds.) DASFAA 2011. LNCS, vol. 6587, pp. 224–238. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20149-3_18

    Chapter  Google Scholar 

  5. Yen, S.-J., Lee, Y.-S.: Mining high utility quantitative association rules. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds.) DaWaK 2007. LNCS, vol. 4654, pp. 283–292. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74553-2_26

    Chapter  Google Scholar 

  6. Fournier-Viger, P., Lin, J.C.-W., Vo, B., Chi, T.T., Zhang, J., Le, H.B.: A Survey of Itemset Mining. WIREs Interdisciplinary reviews - Data Mining and Knowledge Discovery, Wiley (2017)

    Google Scholar 

  7. Chonsheng, Z., et al.: An empirical evaluation of high utility itemset mining algorithms. In: 101, pp. 91–115 (2018)

    Google Scholar 

  8. Guns, T., Nijssen, S., Raedt, L.D.: Itemset mining: a constraint programming perspective. Artif. Intell. 175, 1951–1983 (2011)

    Article  MathSciNet  Google Scholar 

  9. Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: ACM SIGKDD, pp. 204–212 (2008)

    Google Scholar 

  10. Coquery, E., Jabbour, S., Sais, L., Salhi, Y.: A SAT-based approach for discovering frequent, closed and maximal patterns in a sequence. In: Proceedings of ECAI, pp. 258–263 (2012)

    Google Scholar 

  11. Jabbour, S., Sais, L., Salhi, Y.: The top-k frequent closed itemset mining using top-k SAT problem. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 403–418. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40994-3_26

    Chapter  Google Scholar 

  12. Liu, Y., Liao, W., Choudhary, A.: A two-phase algorithm for fast discovery of high utility itemsets. In: Ho, T.B., Cheung, D., Liu, H. (eds.) PAKDD 2005. LNCS (LNAI), vol. 3518, pp. 689–695. Springer, Heidelberg (2005). https://doi.org/10.1007/11430919_79

    Chapter  Google Scholar 

  13. Tseng, V.S., Shie, B.-E., Wu, C.-W., Yu, P.S.: Efficient algorithms for mining high utility itemsets from transaction databases. IEEE Trans. Knowl. Data Eng. 25(8), 1772–1786 (2013)

    Article  Google Scholar 

  14. Liu, M., Qu, J.: Mining high utility itemsets without candidate generation. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, p. 5564 (2012)

    Google Scholar 

  15. Krishnamoorthy, S.: Pruning strategies for mining high utility itemsets. Expert Syst. Appl. 42(5), 2371–2381 (2015)

    Article  Google Scholar 

  16. Fournier-Viger, P., Wu, C.-W., Zida, S., Tseng, V.S.: FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Proceedings of the 21st International Symposium on Methodologies for Intelligent System, p. 8392 (2014)

    Google Scholar 

  17. Zida, S., Fournier-Viger, P., Lin, J.C.W., Wu, C.W., V.S. Tseng: EFIM: A Highly Ecient Algorithm for High-Utility Itemset Mining

    Google Scholar 

  18. Peng, A.Y., Koh, Y.S., Riddle, P.: mHUIMiner: a fast high utility itemset mining algorithm for sparse datasets. In: Kim, J., Shim, K., Cao, L., Lee, J.-G., Lin, X., Moon, Y.-S. (eds.) PAKDD 2017. LNCS (LNAI), vol. 10235, pp. 196–207. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57529-2_16

    Chapter  Google Scholar 

  19. Duong, Q.-H., Fournier-Viger, P., Ramampiaro, H., Nørvåg, K., Dam, T.-L.: Efficient high utility itemset mining using buffered utility-lists. Appl. Intell. 48(7), 1859–1877 (2017). https://doi.org/10.1007/s10489-017-1057-2

    Article  Google Scholar 

  20. Tseitin, G.: On the complexity of derivations in the propositional calculus. In: Structures in Constructives Mathematics and Mathematical Logic, Part II, pp. 115–125 (1968)

    Google Scholar 

  21. Hsu, W.-L., Nemhauser, G.L.: A polynomial algorithm for the minimum weighted clique cover problem on claw-free perfect graphs. Discrete Math. 38(1), 65–71 (1982)

    Article  MathSciNet  Google Scholar 

  22. Golumbic, M.C., Stern, M., Levy, A., Morgenstern, G. (eds.): WG 2012. LNCS, vol. 7551. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34611-8

    Book  Google Scholar 

  23. Fournier-Viger, P.: SPMF: a Java open-source data mining library. www.philippe-fournier-viger.com/spmf/. Accessed 15 Aug 2018

  24. Boudane, A., Jabbour, S., Sais, L., Salhi, Y.: A SAT-based approach for mining association rules. In: Proceedings of IJCAI, pp. 2472–2478 (2016)

    Google Scholar 

  25. Boudane, A., Jabbour, S., Sais, L., Salhi, Y.: Enumerating non-redundant association rules using satisfiability. In: Kim, J., Shim, K., Cao, L., Lee, J.-G., Lin, X., Moon, Y.-S. (eds.) PAKDD 2017. LNCS (LNAI), vol. 10234, pp. 824–836. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57454-7_64

    Chapter  Google Scholar 

  26. Cheng, J., Ke, Y., Ada Wai-Chee, F., Yu, J.X., Zhu, L.: Finding maximal cliques in massive networks. ACM Trans. Database Syst. 36, 4 (2011)

    Article  Google Scholar 

  27. Eblen, J.D., Phillips, C.A., Rogers, G.L., et al.: The maximum clique enumeration problem: algorithms, applications, and implementations. BMC Bioinform. 13, S5 (2012)

    Article  Google Scholar 

  28. Jabbour, S., et al.: Boolean satisfiability for sequence mining. In: Proceedings of CIKM 2013, pp. 649–658 (2013)

    Google Scholar 

  29. Fournier-Viger, P., Chun-Wei Lin, J., Truong-Chi, T., Nkambou, R.: A survey of high utility itemset mining. In: Fournier-Viger, P., Lin, J.C.-W., Nkambou, R., Vo, B., Tseng, V.S. (eds.) High-Utility Pattern Mining. SBD, vol. 51, pp. 1–45. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04921-8_1

    Chapter  Google Scholar 

  30. Sörensson, N.E.: An Extensible SAT-solver. In: Proceedings of SAT, pp. 502–518 (2003)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Said Jabbour .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Hidouri, A., Jabbour, S., Raddaoui, B., Yaghlane, B.B. (2020). A SAT-Based Approach for Mining High Utility Itemsets from Transaction Databases. In: Song, M., Song, IY., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2020. Lecture Notes in Computer Science(), vol 12393. Springer, Cham. https://doi.org/10.1007/978-3-030-59065-9_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-59065-9_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59064-2

  • Online ISBN: 978-3-030-59065-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics