Abstract
When plugged into instant interactive data analytics processes, pattern mining algorithms must produce small collections of high-quality patterns in a short amount of time. In Exceptional Model Mining (EMM), even heuristic approaches such as beam search can fail to meet this requirement, because each search step in EMM requires a relatively expensive model induction. In this work, we extend previous work on high-performance controlled pattern sampling with a weighting functionality that gives more importance to certain data records in a dataset. We use the extended framework to quickly obtain patterns that are likely to show highly deviating models. Additionally, we combine this randomized approach with a heuristic pruning procedure that further optimizes pattern quality. Experiments show that, in contrast to traditional beam search, this combined method finds higher-quality patterns within short time budgets.
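To make the idea of record weighting concrete, the following is a minimal sketch of a weighted two-step sampling procedure for itemset patterns. It is an illustration under stated assumptions, not the paper's implementation: the function name `sample_pattern`, the toy records, and the weights are all hypothetical. It assumes each record is a set of items and that each record carries a non-negative weight (for example, a score reflecting how strongly the record deviates from a global model). Under these assumptions, a pattern is drawn with probability proportional to the sum of the weights of the records that contain it.

```python
import random
from collections import Counter


def sample_pattern(records, weights, rng):
    """Draw one itemset with probability proportional to its weighted frequency.

    Hypothetical helper for illustration only. `records` is a list of
    frozensets of items and `weights` holds one non-negative number per record.
    """
    # Step 1: pick a record with probability proportional to w(D) * 2^|D|.
    # The factor 2^|D| counts the subsets of D, so that step 2 can be uniform.
    scores = [w * 2 ** len(r) for r, w in zip(records, weights)]
    record = rng.choices(records, weights=scores, k=1)[0]
    # Step 2: pick a uniformly random subset of the chosen record's items
    # by keeping each item independently with probability 1/2.
    return frozenset(item for item in sorted(record) if rng.random() < 0.5)


if __name__ == "__main__":
    # Toy data (made up): the second record is weighted three times as heavily.
    records = [frozenset({"a", "b"}), frozenset({"a"})]
    weights = [1.0, 3.0]
    rng = random.Random(0)
    counts = Counter(sample_pattern(records, weights, rng) for _ in range(10000))
    for pattern, count in counts.most_common():
        print(sorted(pattern), count)
```

In this sketch the empty pattern is a valid outcome, since it is contained in every record. Raising the weight of a record shifts probability mass toward the patterns it supports, which is the effect the weighting functionality is meant to provide; the pruning procedure mentioned in the abstract is not modeled here.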
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Moens, S., Boley, M. (2014). Instant Exceptional Model Mining Using Weighted Controlled Pattern Sampling. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds) Advances in Intelligent Data Analysis XIII. IDA 2014. Lecture Notes in Computer Science, vol 8819. Springer, Cham. https://doi.org/10.1007/978-3-319-12571-8_18
DOI: https://doi.org/10.1007/978-3-319-12571-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12570-1
Online ISBN: 978-3-319-12571-8
eBook Packages: Computer Science, Computer Science (R0)