An EM-Approach for Clustering Multi-Instance Objects

Hans-Peter Kriegel,
Alexey Pryakhin &
Matthias Schubert

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

In many data mining applications, data objects are modeled as sets of feature vectors, also called multi-instance objects. In this paper, we present an expectation maximization (EM) approach for clustering multi-instance objects. To this end, we introduce a statistical process that models multi-instance objects. Furthermore, we present E-steps and M-steps for EM clustering and a method for finding a good initial model. In our experimental evaluation, we demonstrate that the new EM algorithm improves cluster quality on three real-world data sets compared to k-medoid clustering.
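The abstract does not spell out the statistical process or the exact update rules, so the following is only a rough sketch of what EM clustering over sets of feature vectors can look like. It assumes, purely for illustration, that every instance in a multi-instance object is drawn independently from one spherical Gaussian per cluster; this is not the model proposed in the paper. The names `log_gauss`, `e_step`, `m_step`, `fit`, and the toy `bags` are hypothetical, and the initial means are passed in by hand rather than found by the paper's initialization method.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a spherical Gaussian (assumed model, not the paper's)."""
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (len(x) * math.log(2 * math.pi * var) + sq / var)

def e_step(bags, weights, means, variances):
    """Return, per bag, the posterior probability of each cluster."""
    resp = []
    for bag in bags:
        # Log joint of the cluster and all instances in the bag.
        logs = [math.log(w) + sum(log_gauss(x, m, v) for x in bag)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # subtract the maximum before exponentiating
        exps = [math.exp(value - top) for value in logs]
        total = sum(exps)
        resp.append([e / total for e in exps])
    return resp

def m_step(bags, resp, floor=1e-12, min_var=1e-6):
    """Re-estimate cluster weights, means, and variances from responsibilities."""
    k, d = len(resp[0]), len(bags[0][0])
    weights, means, variances = [], [], []
    for j in range(k):
        bag_mass = sum(r[j] for r in resp)
        # Floors keep the updates defined if a cluster loses all its mass.
        inst_mass = max(sum(r[j] * len(b) for r, b in zip(resp, bags)), floor)
        mean = [sum(r[j] * x[i] for r, b in zip(resp, bags) for x in b) / inst_mass
                for i in range(d)]
        sq = sum(r[j] * sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
                 for r, b in zip(resp, bags) for x in b)
        weights.append(max(bag_mass / len(bags), floor))
        means.append(mean)
        variances.append(max(sq / (inst_mass * d), min_var))
    return weights, means, variances

def fit(bags, init_means, iterations=20):
    """Alternate E- and M-steps, then return the most likely cluster per bag."""
    k = len(init_means)
    weights = [1.0 / k] * k
    means = [list(m) for m in init_means]
    variances = [1.0] * k
    for _ in range(iterations):
        resp = e_step(bags, weights, means, variances)
        weights, means, variances = m_step(bags, resp)
    resp = e_step(bags, weights, means, variances)
    return [max(range(k), key=lambda j: r[j]) for r in resp]

# Toy data: each bag is one multi-instance object (a set of 2-D feature vectors).
bags = [
    [(0.0, 0.1), (0.2, -0.1)],
    [(-0.1, 0.0), (0.1, 0.2), (0.0, -0.2)],
    [(5.0, 5.1), (4.9, 5.0)],
    [(5.2, 4.8), (5.0, 5.0), (4.8, 5.1)],
]
labels = fit(bags, init_means=[bags[0][0], bags[2][0]])
```

On this toy data the two bags near the origin should end up in one cluster and the two bags near (5, 5) in the other. The point of the sketch is that the cluster posterior is computed per bag while the parameter updates pool all instances, weighted by their bag's responsibility.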

Author information

Authors and Affiliations

Institute for Informatics, University of Munich, D-80538, Munich, Germany
Hans-Peter Kriegel, Alexey Pryakhin & Matthias Schubert

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore
Wee-Keong Ng
Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8505, Tokyo, Japan
Masaru Kitsuregawa
School of Computer Science and Technology, Heilongjiang University, China
Jianzhong Li
School of Computer Engineering, Nanyang Technological University, 639798, Singapore, Singapore
Kuiyu Chang

About this paper

Cite this paper

Kriegel, HP., Pryakhin, A., Schubert, M. (2006). An EM-Approach for Clustering Multi-Instance Objects. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_18

DOI: https://doi.org/10.1007/11731139_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33206-0
Online ISBN: 978-3-540-33207-7
eBook Packages: Computer Science, Computer Science (R0)
