Abstract
Mining association rules is an important technique for discovering meaningful patterns in transaction databases. In the current literature, the properties of algorithms to mine association rules are discussed in great detail. We present a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. We use such data and a real-world grocery database to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left-hand side of rules and that lift performs poorly at filtering random noise in transaction data. The probabilistic data modeling approach presented in this paper is not only a valuable framework for analyzing interest measures but also provides a starting point for further research to develop new interest measures which are based on statistical tests and geared towards the specific properties of transaction data.
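As a rough illustration of the kind of experiment the abstract describes, the following sketch simulates transactions in which items occur independently of each other and computes confidence and lift for a rule X → Y, using the standard definitions conf(X → Y) = supp(X ∪ Y) / supp(X) and lift(X → Y) = conf(X → Y) / supp(Y). The independent per-item occurrence model, the function names, and the item probabilities are assumptions made for this example; the abstract does not specify the details of the framework, so this should not be read as the authors' implementation.

```python
import random


def simulate_independent_transactions(item_probs, n_transactions, seed=0):
    """Simulate transactions with no associations between items.

    Assumption for this sketch: each item enters a transaction
    independently with its own fixed probability.
    """
    rng = random.Random(seed)
    return [
        {item for item, p in item_probs.items() if rng.random() < p}
        for _ in range(n_transactions)
    ]


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """conf(lhs -> rhs) = supp(lhs and rhs together) / supp(lhs)."""
    supp_lhs = support(transactions, lhs)
    if supp_lhs == 0:
        return float("nan")  # rule is undefined if the LHS never occurs
    return support(transactions, set(lhs) | set(rhs)) / supp_lhs


def lift(transactions, lhs, rhs):
    """lift(lhs -> rhs) = conf(lhs -> rhs) / supp(rhs)."""
    supp_rhs = support(transactions, rhs)
    if supp_rhs == 0:
        return float("nan")  # rule is undefined if the RHS never occurs
    return confidence(transactions, lhs, rhs) / supp_rhs


if __name__ == "__main__":
    # Hypothetical item probabilities: two frequent items, two rare ones.
    probs = {"bread": 0.5, "milk": 0.4, "saffron": 0.002, "truffle": 0.002}
    data = simulate_independent_transactions(probs, 20000, seed=1)
    for lhs, rhs in [("bread", "milk"), ("saffron", "truffle")]:
        print(lhs, "->", rhs,
              "confidence:", confidence(data, [lhs], [rhs]),
              "lift:", lift(data, [lhs], [rhs]))
```

Because the simulated items are independent, the expected lift of any rule is 1. Comparing the estimate for the frequent pair with the estimate for the rare pair gives a feel for how sampling noise affects lift when item counts are small.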
© 2006 Springer Berlin · Heidelberg
About this paper
Cite this paper
Hahsler, M., Hornik, K., Reutterer, T. (2006). Implications of Probabilistic Data Modeling for Mining Association Rules. In: Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W. (eds) From Data and Information Analysis to Knowledge Engineering. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31314-1_73
DOI: https://doi.org/10.1007/3-540-31314-1_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31313-7
Online ISBN: 978-3-540-31314-4
eBook Packages: Mathematics and Statistics (R0)