Proficient Normalised Fuzzy K-Means With Initial Centroids Methodology

Authors:

Deepali Virmani,

Shefali Upadhyaya,

Abhishek Srivastav

International Journal of Knowledge Discovery in Bioinformatics, Volume 8, Issue 1

Pages 42 - 59

https://doi.org/10.4018/IJKDB.2018010104

Published: 01 January 2018

Abstract

This article examines how data becomes useful when it can be organized, linked with other data, and grouped into clusters. Clustering is the process of organizing a given set of objects into disjoint groups called clusters. Many clustering algorithms exist, including k-means, k-medoids, and normalized k-means, so the focus is on their efficiency and accuracy, on the time clustering takes, and on reducing overlap between clusters. K-means is one of the simplest unsupervised learning algorithms for the well-known clustering problem. It partitions data into K clusters with randomly chosen initial centroids, and because it works only on numeric values it cannot be used directly to cluster real-world data containing categorical values. Poor selection of initial centroids can result in poor clustering. This article proposes a variant of k-means that selects the initial centres and normalizes the data, resulting in better clustering, reduced overlap, and less time required for clustering.
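The ideas named in the abstract (normalizing the data, choosing initial centres deliberately rather than at random, and fuzzy cluster membership) can be illustrated with a minimal Python sketch. The abstract does not spell out the proposed algorithm, so everything below is an assumption made for illustration: min-max scaling stands in for the normalization step, a farthest-first heuristic stands in for the initial-centroid rule, and the membership formula is the standard fuzzy c-means one. All function names are hypothetical.

```python
def min_max_normalize(points):
    """Rescale every feature (column) to [0, 1] so no feature dominates the distance."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple(
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(point, lows, highs)
        )
        for point in points
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centroids(points, k):
    """Pick k spread-out starting centroids.

    Assumption: an illustrative heuristic, not the article's initialization rule.
    """
    centroids = [min(points)]  # deterministic start: lexicographically smallest point
    while len(centroids) < k:
        # Add the point whose nearest chosen centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def k_means(points, k, max_iterations=100):
    """Plain k-means (Lloyd's iterations) started from deterministic centroids."""
    centroids = farthest_first_centroids(points, k)
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, assign(points, centroids)


def fuzzy_memberships(points, centroids, fuzziness=2.0):
    """Standard fuzzy c-means membership degrees; each row sums to 1."""
    exponent = 1.0 / (fuzziness - 1.0)  # applied to squared distances
    rows = []
    for p in points:
        d2 = [squared_distance(p, c) for c in centroids]
        if 0.0 in d2:
            # The point sits exactly on one or more centroids: split full membership there.
            share = 1.0 / d2.count(0.0)
            rows.append([share if d == 0.0 else 0.0 for d in d2])
        else:
            rows.append([1.0 / sum((dj / dk) ** exponent for dk in d2) for dj in d2])
    return rows


# Made-up sample: two features on very different scales.
raw = [(1.0, 1000.0), (1.2, 1100.0), (0.9, 900.0),
       (8.0, 9000.0), (8.5, 9500.0), (7.9, 8800.0)]
scaled = min_max_normalize(raw)
centroids, labels = k_means(scaled, k=2)
memberships = fuzzy_memberships(scaled, centroids)
```

On this made-up sample the first three points end up in one cluster and the last three in the other, and each row of `memberships` gives a point's degree of belonging to each cluster. Starting from deterministic centroids makes repeated runs give the same result, which is one practical motivation for replacing random initialization.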

Index Terms: Computing methodologies > Machine learning > Learning paradigms > Unsupervised learning

Published In

International Journal of Knowledge Discovery in Bioinformatics, Volume 8, Issue 1 (January 2018), 105 pages

ISSN: 1947-9115

EISSN: 1947-9123

Publisher: IGI Global, United States
