Ensemble Method for Multi-view Text Clustering

  • Conference paper
Computational Collective Intelligence (ICCCI 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11683)

Abstract

Textual data frequently occurs as an unlabeled document collection, so it is useful to sort such a collection into clusters of related documents. Text, however, has different aspects that a single representation cannot capture. Multi-view clustering presents an efficient solution: it integrates different representations, called “views”, by exploiting their complementary characteristics. However, existing methods use a single representation mode for all views, one based on term frequencies. Such a representation loses valuable information and fails to capture the semantic aspect of text. To overcome these issues, we propose a new method for multi-view text clustering that exploits different representations of text in order to improve the quality of clustering. The experimental results show that the proposed method outperforms other methods and enhances the clustering quality.
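
The abstract does not spell out how the per-view clusterings are combined, so the following is only a minimal sketch of one generic cluster-ensemble idea (a co-association vote), not the authors' method. It assumes each view has already been clustered separately; the three label lists, the function name `consensus`, and the `threshold` parameter are hypothetical and chosen for illustration.

```python
from itertools import combinations


def consensus(partitions, threshold=0.5):
    """Combine per-view cluster labels into one partition.

    Two documents are linked when they share a cluster in more than
    `threshold` of the views; the consensus clusters are the connected
    components of those links. Label values need not match across views.
    """
    n = len(partitions[0])
    m = len(partitions)
    parent = list(range(n))  # union-find forest over document indices

    def find(x):
        # Follow parent pointers to the root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        votes = sum(labels[i] == labels[j] for labels in partitions)
        if votes / m > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0..k-1 in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Hypothetical labels for five documents, one list per view
# (for example, produced by clustering three different text representations).
view_a = [0, 0, 1, 1, 1]
view_b = [1, 1, 0, 0, 1]
view_c = [0, 0, 0, 1, 1]

final_labels = consensus([view_a, view_b, view_c])
print(final_labels)
```

In this toy input, documents 0 and 1 are grouped together in every view, while documents 2, 3 and 4 are chained together by majority votes, so the vote yields two consensus clusters even though no single view agrees with the others on every document.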

Author information

Corresponding author

Correspondence to Maha Fraj.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Fraj, M., Ben Hajkacem, M.A., Essoussi, N. (2019). Ensemble Method for Multi-view Text Clustering. In: Nguyen, N., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2019. Lecture Notes in Computer Science, vol 11683. Springer, Cham. https://doi.org/10.1007/978-3-030-28377-3_18

  • DOI: https://doi.org/10.1007/978-3-030-28377-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28376-6

  • Online ISBN: 978-3-030-28377-3

  • eBook Packages: Computer Science, Computer Science (R0)
