Abstract
Knowledge discovery in databases is a complex, iterative, and highly interactive process. When mining for association rules, however, interactivity is typically stifled by the execution times of the rule generation algorithms. Our approach is to accept a single, possibly expensive mining run and then to answer all subsequent mining queries interactively by accessing a sophisticated rule cache. This raises two critical issues. First, access to the cache must be efficient and convenient. We therefore enrich the basic association mining framework with descriptions of items through application-dependent attributes, and we extend current mining query languages to handle these attributes through ∃ and ∀ quantifiers. Second, the cache must be able to answer a broad variety of queries without rerunning the mining algorithm. A main contribution of this paper is to show how restrict operations on the transactions can be postponed from rule generation to rule retrieval from the cache. That is, without actually rerunning the algorithm, we efficiently construct from the cache those rules that would have been generated had the mining algorithm been run on only a subset of the transactions. In addition, we describe how we implemented our ideas on a conventional relational database system. We evaluate the response times of our prototype in a pilot application at DaimlerChrysler; it easily satisfies the demands of interactive data mining.
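To make the idea of a postponed restrict operation concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes that the restriction can itself be written as an itemset R ("keep only transactions containing all items of R") and that the cache stores absolute support counts of frequent itemsets. Under these assumptions, the number of restricted transactions containing X equals the cached count of X ∪ R, so support and confidence on the subset follow from cache lookups alone. The names `build_cache`, `restricted_rule`, `TRANSACTIONS`, and `CACHE` are hypothetical, and the brute-force `build_cache` merely stands in for the single expensive mining run.

```python
from fractions import Fraction
from itertools import combinations


def build_cache(transactions, min_count):
    """Stand-in for the one expensive mining run (hypothetical helper).

    Counts every itemset by brute force and keeps those that occur in at
    least min_count transactions. The empty itemset holds the total count.
    """
    items = sorted(set().union(*transactions))
    cache = {}
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_count:
                cache[itemset] = count
    return cache


def restricted_rule(cache, body, head, restrict):
    """Support and confidence of body -> head on the restricted transactions.

    The restricted transactions are those containing every item of restrict.
    Only cached counts are read; the transactions are never rescanned.
    Returns None if a required itemset is missing from the cache.
    """
    body, head, restrict = frozenset(body), frozenset(head), frozenset(restrict)
    try:
        n_restrict = cache[restrict]
        n_body = cache[body | restrict]
        n_rule = cache[body | head | restrict]
    except KeyError:
        return None  # below the threshold of the original run, so not cached
    return Fraction(n_rule, n_restrict), Fraction(n_rule, n_body)


# Toy data, invented purely for illustration.
TRANSACTIONS = [
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
CACHE = build_cache(TRANSACTIONS, min_count=2)
```

For example, `restricted_rule(CACHE, {"milk"}, {"butter"}, {"bread"})` evaluates the rule milk → butter as if only the four transactions containing bread had been mined, and returns a support of 1/2 and a confidence of 2/3. The sketch also shows the limit of the approach: an itemset that fell below the support threshold of the original run is not in the cache, so the function returns `None` instead of guessing.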
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hipp, J., Mangold, C., Güntzer, U., Nakhaeizadeh, G. (2002). Efficient Rule Retrieval and Postponed Restrict Operations for Association Rule Mining. In: Chen, MS., Yu, P.S., Liu, B. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2002. Lecture Notes in Computer Science, vol 2336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47887-6_6
DOI: https://doi.org/10.1007/3-540-47887-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43704-8
Online ISBN: 978-3-540-47887-4
eBook Packages: Springer Book Archive