Ambiguity-driven fuzzy C-means clustering: how to detect uncertain clustered records

Applied Intelligence

Abstract

As a well-known clustering algorithm, Fuzzy C-Means (FCM) allows each input sample to belong to more than one cluster, which makes it more flexible than non-fuzzy clustering methods. However, the accuracy of FCM suffers from false detections caused by noisy records, weak feature selection, and the low certainty of the algorithm in some cases. Such false detections matter greatly in decision-making domains such as network security and medical diagnosis, where weak decisions based on them may lead to catastrophic outcomes. They mainly arise when decisions are made about records that do not provide sufficient evidence for a good decision. In this paper, we propose a method for detecting such ambiguous records in FCM by introducing a certainty factor that reduces invalid detections. The detected ambiguous records can then be sent to another discrimination method for deeper investigation, which increases accuracy by lowering the error rate. Most records are still processed quickly and with a low error rate, which avoids the performance loss common in similar hybrid methods. Experimental results on several datasets from different domains show a significant decrease in error rate as well as improved sensitivity of the algorithm.
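
The idea of flagging ambiguous records can be illustrated with a minimal sketch. The abstract does not define the certainty factor, so the code below assumes a hypothetical one: the gap between a record's two largest FCM memberships. The function names, the threshold of 0.3, and the toy data are all illustrative assumptions and are not taken from the paper. Only the membership formula is the standard FCM one.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard FCM membership matrix U (n_samples x n_clusters) for fixed centers."""
    # Euclidean distance from every record to every cluster center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, eps)  # avoid division by zero when a record sits on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def split_by_certainty(U, threshold=0.3):
    """Flag records whose top two memberships are too close.

    ASSUMPTION: 'certainty' here is the gap between the two largest
    memberships; the paper's actual certainty factor may differ.
    """
    top2 = np.sort(U, axis=1)[:, -2:]
    certainty = top2[:, 1] - top2[:, 0]
    labels = U.argmax(axis=1)
    ambiguous = certainty < threshold
    return labels, certainty, ambiguous

# Toy data: three records close to a center, one midway between both centers.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [2.5, 2.5]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])

U = fcm_memberships(X, centers)
labels, certainty, ambiguous = split_by_certainty(U)

confident_idx = np.flatnonzero(~ambiguous)  # keep the fast FCM decision
ambiguous_idx = np.flatnonzero(ambiguous)   # hand over to a second method
```

In this sketch only the midway record is flagged, so it alone would be passed to a second, more expensive discrimination method, while the remaining records keep their FCM labels.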



Author information

Corresponding author

Correspondence to Nasser Ghadiri.

About this article

Cite this article

Ghaffari, M., Ghadiri, N. Ambiguity-driven fuzzy C-means clustering: how to detect uncertain clustered records. Appl Intell 45, 293–304 (2016). https://doi.org/10.1007/s10489-016-0759-1

  • DOI: https://doi.org/10.1007/s10489-016-0759-1
