Gaussian kernel width exploration and cone cluster labeling for support vector clustering

Sei-Hyung Lee¹ &
Karen M. Daniels²

Abstract

The process of clustering groups together data points so that intra-cluster similarity is maximized while inter-cluster similarity is minimized. Support vector clustering (SVC) is a clustering approach that can identify arbitrarily shaped cluster boundaries. The execution time of SVC depends heavily on several factors: choice of the width of a kernel function that determines a nonlinear transformation of the input data, solution of a quadratic program, and the way that the output of the quadratic program is used to produce clusters. This paper builds on our prior SVC research in two ways. First, we propose a method for identifying a kernel width value in a region where our experiments suggest that clustering structure is changing significantly. This can form the starting point for efficient exploration of the space of kernel width values. Second, we offer a technique, called cone cluster labeling, that uses the output of the quadratic program to build clusters in a novel way that avoids an important deficiency present in previous methods. Our experimental results use both two-dimensional and high-dimensional data sets.
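The abstract refers to the width of a kernel function without stating its form. As a minimal illustrative sketch, assume the Gaussian kernel parameterization commonly used in the support vector clustering literature, K(x, y) = exp(-q * ||x - y||^2), where q is the width parameter. The function names and the sample points below are hypothetical and are not taken from the paper; the snippet only shows how changing q alters the similarities that feed the quadratic program.

```python
import math

def gaussian_kernel(x, y, q):
    """Assumed form: exp(-q * ||x - y||^2), with q the kernel width parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-q * sq_dist)

def kernel_matrix(points, q):
    """Pairwise kernel values for a list of points (symmetric, ones on the diagonal)."""
    return [[gaussian_kernel(p, r, q) for r in points] for p in points]

# Hypothetical toy data: two nearby points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]

for q in (0.01, 1.0, 100.0):
    K = kernel_matrix(points, q)
    # Small q: all points look similar. Large q: each point is similar only to itself.
    print(q, [round(v, 4) for v in K[0]])
```

Under this assumption, small values of q make every pair of points look similar, while large values make each point similar only to itself, which is why the choice of q strongly affects the clustering structure that results.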

Notes

MATLAB is a registered trademark of The MathWorks, Inc.
The feature space is a Hilbert space, so the Pythagorean theorem holds.

Author information

Authors and Affiliations

EMC, Hopkinton, MA, 01748, USA
Sei-Hyung Lee
Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, 01854, USA
Karen M. Daniels

Corresponding author

Correspondence to Sei-Hyung Lee.

About this article

Cite this article

Lee, SH., Daniels, K.M. Gaussian kernel width exploration and cone cluster labeling for support vector clustering. Pattern Anal Applic 15, 327–344 (2012). https://doi.org/10.1007/s10044-011-0244-8

Received: 21 April 2010
Accepted: 16 September 2011
Published: 04 October 2011
Issue Date: August 2012
DOI: https://doi.org/10.1007/s10044-011-0244-8
