Abstract
Ordered sequence data is common in practice, and sequence analysis plays an important role in a wide range of applications, such as market basket analysis. The weight concept helps to find more interesting sequences that conventional sequential pattern mining may otherwise treat as meaningless patterns. Effectively discovering these high-weight sequences from a quantitative sequential database is therefore an important task. Based on the remaining-weight concept, we propose a novel algorithm called Fast Weighted Sequential Pattern Mining (FWSPM), which uses an upper bound called the remaining sequence maximum weight. Based on this upper bound, an effective pruning strategy is designed to reduce the search space and the memory cost. Experimental results on both real and synthetic datasets show that FWSPM is more efficient than existing algorithms and scales well to large datasets.
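As a rough illustration of how an upper bound built from remaining weights can prune a pattern-growth search, the sketch below may help. It is not the FWSPM algorithm, and its definitions are assumptions made for illustration only: each sequence is a list of single items, a pattern's weight is the average weight of its items, weighted support is pattern weight times support divided by the database size, and the bound for a pattern takes, in each sequence that contains it, the largest weight among the pattern's items and the items remaining after its earliest match. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Pattern = Tuple[str, ...]

def first_match_end(seq: List[str], pattern: Pattern) -> Optional[int]:
    """Index just past the earliest subsequence match of pattern in seq, or None."""
    pos = 0
    for item in pattern:
        try:
            pos = seq.index(item, pos) + 1
        except ValueError:
            return None
    return pos

def weighted_support(db: List[List[str]], pattern: Pattern,
                     weights: Dict[str, float]) -> float:
    # Assumed definition: average item weight times relative support.
    avg_weight = sum(weights[i] for i in pattern) / len(pattern)
    support = sum(1 for s in db if first_match_end(s, pattern) is not None)
    return avg_weight * support / len(db)

def upper_bound(db: List[List[str]], pattern: Pattern,
                weights: Dict[str, float]) -> float:
    # Assumed bound: no extension of `pattern` can have an average weight above
    # the largest weight among the pattern's items and the items that remain
    # after its earliest match, so summing that maximum over the supporting
    # sequences over-estimates the weighted support of every extension.
    total = 0.0
    for s in db:
        end = first_match_end(s, pattern)
        if end is None:
            continue
        total += max([weights[i] for i in pattern] + [weights[i] for i in s[end:]])
    return total / len(db)

def mine(db: List[List[str]], weights: Dict[str, float],
         min_wsup: float) -> Dict[Pattern, float]:
    """Depth-first pattern growth with upper-bound pruning."""
    items = sorted({i for s in db for i in s})
    results: Dict[Pattern, float] = {}

    def grow(prefix: Pattern) -> None:
        for item in items:
            cand = prefix + (item,)
            ub = upper_bound(db, cand, weights)
            if ub == 0.0 or ub < min_wsup:
                continue  # neither cand nor any extension of it can qualify
            ws = weighted_support(db, cand, weights)
            if ws >= min_wsup:
                results[cand] = ws
            grow(cand)

    grow(())
    return results
```

For example, with sequences `[a, b, c]`, `[a, c]`, `[b, c]`, `[a, b]` and weights `a = 0.9`, `b = 0.2`, `c = 0.6`, a threshold of 0.4 prunes the whole branch rooted at `b`, because its bound (0.35) is already below the threshold. The bound is deliberately loose; a tighter one prunes more but costs more to compute.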
Acknowledgment
This research was supported in part by the National Natural Science Foundation of China (Grant Nos. 61902079 and 62002136), Guangzhou Basic and Applied Basic Research Foundation (Grant Nos. 202102020928 and 202102020277), and the Young Scholar Program of Pazhou Lab (Grant No. PZL2021KF0023).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Ye, Z., Li, Z., Guo, W., Gan, W., Wan, S., Chen, J. (2022). Fast Weighted Sequential Pattern Mining. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science, vol 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_68
DOI: https://doi.org/10.1007/978-3-031-08530-7_68
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08529-1
Online ISBN: 978-3-031-08530-7
eBook Packages: Computer Science (R0)