Proficient Normalised Fuzzy K-Means With Initial Centroids Methodology

Published: 01 January 2018

Abstract

This article examines how relevant data can be organized, linked with other data, and grouped into clusters. Clustering is the process of organizing a given set of objects into disjoint groups called clusters. There are a number of clustering algorithms, such as k-means, k-medoids, and normalized k-means, so the focus is on the efficiency and accuracy of these algorithms, on the time taken for clustering, and on reducing overlap between clusters. K-means is one of the simplest unsupervised learning algorithms for the well-known clustering problem. It partitions data into K clusters whose initial centroids are chosen randomly, and its reliance on numeric values prohibits it from being used to cluster real-world data containing categorical values. Poor selection of initial centroids can result in poor clustering. This article proposes a variant of k-means that selects the initial centres and normalizes the data, resulting in better clustering, reduced overlap, and less time required for clustering.
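The abstract names three ingredients (normalizing the data, choosing initial centres deliberately, and fuzzy k-means) without giving their details. The sketch below shows one generic way they can fit together; it is not the authors' algorithm. Min-max scaling and farthest-point seeding are assumptions chosen for illustration, and all function names and the fuzzifier default `m=2.0` are hypothetical.

```python
import math


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def farthest_point_centroids(points, k):
    """Deterministic seeding (an assumption, not the paper's rule):
    start at the point nearest the overall mean, then repeatedly add
    the point farthest from the centroids chosen so far."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    chosen = [min(points, key=lambda p: math.dist(p, mean))]
    while len(chosen) < k:
        chosen.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in chosen))
        )
    return [list(c) for c in chosen]


def memberships_for(points, centroids, m):
    """Standard fuzzy membership: u_ij proportional to d_ij^(-2/(m-1))."""
    result = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        if any(d == 0.0 for d in dists):
            # A point sitting exactly on a centroid belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            result.append([h / total for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            result.append([v / total for v in inv])
    return result


def fuzzy_k_means(points, k, m=2.0, iters=100, tol=1e-6):
    """Alternate membership and centroid updates until centroids settle.
    Assumes k does not exceed the number of distinct points."""
    dim = len(points[0])
    centroids = farthest_point_centroids(points, k)
    for _ in range(iters):
        u = memberships_for(points, centroids, m)
        updated = []
        for j in range(k):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            updated.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids, memberships_for(points, centroids, m)


# Two features on very different scales; normalization keeps the
# second column from dominating the distance.
raw = [[1, 100], [2, 110], [1.5, 105], [9, 900], [10, 910], [9.5, 905]]
data = min_max_normalize(raw)
centroids, memberships = fuzzy_k_means(data, k=2)
labels = [row.index(max(row)) for row in memberships]
```

Because the seeding is deterministic, repeated runs on the same data return the same centroids, which removes the run-to-run variation that random initialization introduces. Each row of `memberships` sums to 1, and `labels` takes the strongest membership as a hard assignment.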

    Published In

    International Journal of Knowledge Discovery in Bioinformatics, Volume 8, Issue 1
    January 2018
    105 pages
    ISSN: 1947-9115
    EISSN: 1947-9123

    Publisher

    IGI Global

    United States

    Author Tags

    1. Cluster Analysis
    2. Clustering Algorithms
    3. Data Mining
    4. Fuzzy K-Means
    5. Initial Centroids
    6. K-Means
    7. K-Medoids
    8. Normalization
    9. Normalized K-Means
    10. Weighted Average
