K-Harmonic Means -A Spatial Clustering Algorithm with Boosting

Bin Zhang³,
Meichun Hsu³ &
Umesh Dayal³

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 2007))

Included in the following conference series:

International Workshop on Temporal, Spatial, and Spatio-Temporal Data Mining

1108 Accesses
34 Citations

Abstract

We propose a new center-based iterative clustering algorithm, K- Harmonic Means (KHM), which is essentially insensitive to the initialization of the centers, demonstrated through a set of experiments. The dependency of the K-Means performance on the initialization of the centers has been a major problem; a similar issue exists for an alternative algorithm, Expectation Maximization (EM). Many have tried to generate good initializations to solve the sensitivity problem. KHM addresses the intrinsic problem by replacing the minimum distance from a data point to the centers, used in K-means, by the Harmonic Averages of the distances from the data point to all centers. KHM significantly improves the quality of clustering results comparing with both K- Means and EM. The KHM algorithm has been implemented in both sequential and parallel languages and tested on hundreds of randomly generated datasets with different data distribution and clustering characteristics.

Primary Contact: bzhang@hpl.hp.com. This document is released as a technical report in Oct. 1999, available at http://www.hpl.hp.com/techreports/1999/HPL-1999-124.html.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: GBP 19.95; Price includes VAT (United Kingdom)

eBook: GBP 35.99; Price includes VAT (United Kingdom)

Softcover Book: GBP 44.99; Price includes VAT (United Kingdom)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

DISCERN: diversity-based selection of centroids for k-estimation and rapid non-stochastic clustering

Article 21 September 2020

Favoring the k-Means Algorithm with Initialization Methods

Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

References

Anderberg, M. R. 1973. Cluster analysis for applications. Academic Press, New York. 35p.
Google Scholar
Bay, S. D. (1999). The UCI KDD Archive [http://kdd.ics.uci.edu]. Irvine, CA: University of California, Department of Information and Computer Science.
Google Scholar
Bezdek, Ehrlich, & Full, “FCM: THE FUZZY c-MEANS CLUSTERING ALGORITHM”, Computers & Geosciences, v.10, pp.191–203, 1984
Article Google Scholar
Bradley, P., Fayyad, U. M., and Reina, C.A., “Scaling EM Clustering to Large Databases,” MS Technical Report, 1998.
Google Scholar
Bradley, P., Fayyad, U. M., C.A., “Refining Initial Points for KM Clustering”, MS Technical Report MSR-TR-98-36, May 1998.
Google Scholar
Bradley, P., Fayyad, U.M., and Reina, C.A., “Scaling Clustering to Large Databases”, KDD98, 1998.
Google Scholar
Duda, R., Hart, P., “Pattern Classification and Scene Analysis”, John Wiley & Sons, 1972.
Google Scholar
Dempster, A. P., Laird, N.M., and Rubin, D.B., “Miximum Likelyhood from Incomplete Data via the EM Algorithm”, Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.
MATH MathSciNet Google Scholar
Fayyad, U. M., Piatetsky-Shapiro, G. Smyth, P. and Uthurusamy, R., “Advances in Knowledge Discovery and Data Mining”, AAAI Press 1996
Google Scholar
Gersho & Gray, “Vector Quantization and Signal Compression”, KAP, 1992
Google Scholar
Gill, P.E., Murray, W. and Wright, H.M., “Practical Optimization”, Academic Press, 1981.
Google Scholar
Gonzales, T.F., “Clustering to Minimize the Maximum Intercluster Distance”, Theo. Comp. Sci. 38, p293–306, 1985.
Article Google Scholar
Kaufman, L. and Rousseeuw, P. J., “Finding Groups in Data: An Introduction to Cluster Analysis”, John Wiley & Sons, 1990.
Google Scholar
MacQueen, J. 1967. Some methods for classification and analysis of multivariate observations. Pp. 281–297 in: L. M. Le Cam & J. Neyman [eds.] Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, Vol. 1. University of California Press, Berkeley. xvii + 666 p.
Google Scholar
McKenzie, P. and Alder, M., “Initializing the EM Algorithm for Use in Gaussian Mixture Modeling”, The Univ. of Western Australia, Center for Information Processing Systems, Manuscript.
Google Scholar
McLachlan, G. J. and Krishnan, T., “The EM Algorithm and Extensions.”, John Wiley & Sons, Inc., 1997
Google Scholar
Pelleg, D. and Moore, A, “Accelerating Exact K-Means Algorithms with Geometric Reasoning”, KDD-99, Proc. of the Fifth ACM SIGKDD Intern. Conf. On Knowledge Discovery and Data Mining, page 277–281.
Google Scholar
Rendner, R.A. and Walker, H.F., “Mixture Densities, Maximum Likelihood and The EM Algorithm”, SIAM Review, vol. 26 #2, 1984.
Google Scholar
Selim, S.Z. and Ismail, M.A., “K-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality”, IEEE Trans. On PAMI-6, #1, 1984.
Google Scholar
Zhang, B., Hsu, M., Forman, G., “Accurate Recasting of Parameter Estimation Algorithms using Sufficient Statistics for Efficient Parallel Speed-up”, To appear in PKDD 2000, September, Lyon, France.
Google Scholar

Download references

Author information

Authors and Affiliations

Hewlett-Packard Research Laboratory, USA
Bin Zhang, Meichun Hsu & Umesh Dayal

Authors

Bin Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Meichun Hsu
View author publications
You can also search for this author in PubMed Google Scholar
Umesh Dayal
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

School of Informatics and Engineering, Flinders University of South Australia, P.O. Box 2100, 5001, Adelaide, South Australia
John F. Roddick
National Center for Geographic Information and Analysis, University of Maine, ME 04469-5711, Maine, Orono, USA
Kathleen Hornsby

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhang, B., Hsu, M., Dayal, U. (2001). K-Harmonic Means -A Spatial Clustering Algorithm with Boosting. In: Roddick, J.F., Hornsby, K. (eds) Temporal, Spatial, and Spatio-Temporal Data Mining. TSDM 2000. Lecture Notes in Computer Science(), vol 2007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45244-3_4

Download citation

DOI: https://doi.org/10.1007/3-540-45244-3_4
Published: 16 March 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41773-6
Online ISBN: 978-3-540-45244-7
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics

K-Harmonic Means -A Spatial Clustering Algorithm with Boosting

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

DISCERN: diversity-based selection of centroids for k-estimation and rapid non-stochastic clustering

Favoring the k-Means Algorithm with Initialization Methods

Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

K-Harmonic Means -A Spatial Clustering Algorithm with Boosting

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

DISCERN: diversity-based selection of centroids for k-estimation and rapid non-stochastic clustering

Favoring the k-Means Algorithm with Initialization Methods

Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation