DOI: 10.1145/3287921.3287927
research-article

A New Assessment of Cluster Tendency Ensemble approach for Data Clustering

Published: 06 December 2018

Abstract

Ensemble learning is a general machine learning approach based on the divide-and-conquer principle. An ensemble aims to improve a system's performance in terms of processing speed and quality. Assessment of cluster tendency is a method for determining whether a given data set contains meaningful clusters. Recently, a silhouette-based assessment of cluster tendency method (SACT) was proposed to simultaneously determine the appropriate number of data clusters and their prototypes. The advantages of SACT are its accuracy and its small number of parameters, while its limitations concern data size and processing speed. In this paper, we propose an improved SACT method for data clustering, which we call the eSACT algorithm. Experiments were conducted on synthetic data sets and color images. The proposed algorithm exhibited high performance, reliability, and accuracy compared with previously proposed algorithms for the assessment of cluster tendency.
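To make the two ideas in the abstract concrete, the following is a minimal sketch, not the SACT or eSACT algorithm itself, whose details are not given on this page. It assumes a plain k-means with farthest-point initialisation as the base clusterer, picks the number of clusters that maximises the mean silhouette, and then combines estimates from random subsamples by majority vote as one generic ensemble scheme. All function names, parameter values, and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def kmeans(points, k, iters=20):
    """Plain k-means; returns one cluster label per point."""
    # Farthest-point initialisation (an assumption of this sketch):
    # deterministic and adequate for well-separated toy data.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return labels


def mean_silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)


def estimate_k(points, k_values=range(2, 7)):
    """Number of clusters with the highest mean silhouette."""
    return max(k_values, key=lambda k: mean_silhouette(points, kmeans(points, k)))


def ensemble_estimate_k(points, n_members=5, sample_size=45, seed=0):
    """Majority vote over estimates computed on random subsamples."""
    rng = random.Random(seed)
    votes = Counter(estimate_k(rng.sample(points, sample_size))
                    for _ in range(n_members))
    return votes.most_common(1)[0][0]


# Toy data: three well-separated Gaussian blobs of 30 points each.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in blob_centers for _ in range(30)]

best_k = estimate_k(data)            # silhouette on the full data set
voted_k = ensemble_estimate_k(data)  # vote over five half-size subsamples
print(best_k, voted_k)
```

The silhouette computation needs all pairwise distances, so its cost grows quadratically with the number of points. Running it on smaller subsamples and voting is one way an ensemble can trade a single expensive pass for several cheaper ones; on this toy data the blobs are far enough apart that both calls should agree on three clusters.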


Published In

ACM Other conferences
SoICT '18: Proceedings of the 9th International Symposium on Information and Communication Technology
December 2018
496 pages
ISBN: 9781450365390
DOI: 10.1145/3287921
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • SOICT: School of Information and Communication Technology - HUST
  • NAFOSTED: The National Foundation for Science and Technology Development

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Clustering
  2. assessment of the cluster tendency
  3. ensemble
  4. number of clusters

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Vietnam National Foundation for Science and Technology Development (NAFOSTED)

Conference

SoICT 2018

Acceptance Rates

Overall Acceptance Rate 147 of 318 submissions, 46%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 83
  • Downloads (Last 12 months): 2
  • Downloads (Last 6 weeks): 1

Reflects downloads up to 13 Dec 2024
