An Attention Hierarchical Topic Modeling

Chunyan Yin, Yongheng Chen & Wanli Zuo

Abstract

Probabilistic topic models are widely used to detect topic-based content representations in a collection of documents. However, such models capture semantic information under simplifying assumptions that ignore valuable word-order information. This paper proposes an attention hierarchical topic model, which adopts an attention mechanism to unify topic embeddings and word embeddings in a single framework and thereby enhance the clustering effect of the hierarchical Dirichlet process. In addition, a multi-information integration Chinese restaurant franchise is adopted to construct the model, which further combines timestamp, user, and topic-label information to optimize topic modeling. Extensive experiments on real-life applications show that our model outperforms several strong baselines on document modeling and classification.
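The two ingredients named in the abstract can be illustrated with a minimal sketch. The paper's actual formulation is not part of this preview, so everything below is an assumption made for illustration: the function names `crp_seating_probs` and `attention_fuse` are hypothetical, the seating rule is the standard Chinese restaurant process that underlies the Chinese restaurant franchise (without the timestamp, user, or topic-label extensions), and the attention step is generic softmax dot-product attention rather than the authors' mechanism.

```python
import numpy as np


def crp_seating_probs(table_counts, alpha):
    """Standard Chinese restaurant process seating rule (illustrative only).

    An existing table k is chosen with probability n_k / (n + alpha);
    a new table is opened with probability alpha / (n + alpha).
    The last entry of the returned vector is the new-table probability.
    """
    counts = np.asarray(table_counts, dtype=float)
    total = counts.sum() + alpha
    return np.append(counts, alpha) / total


def attention_fuse(word_vec, topic_vecs):
    """Generic softmax dot-product attention of one word over topic embeddings.

    Hypothetical fusion step: the word embedding is concatenated with an
    attention-weighted average of the topic embeddings.
    """
    scores = topic_vecs @ word_vec           # one score per topic
    scores = scores - scores.max()           # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ topic_vecs           # weighted topic summary
    return weights, np.concatenate([word_vec, context])


if __name__ == "__main__":
    # Two occupied tables with 3 and 1 customers, concentration alpha = 1.
    print(crp_seating_probs([3, 1], alpha=1.0))

    word = np.array([1.0, 0.0])
    topics = np.array([[1.0, 0.0], [0.0, 1.0]])
    weights, fused = attention_fuse(word, topics)
    print(weights, fused)
```

In this sketch the seating probabilities depend only on table occupancy; a model of the kind described in the abstract would presumably reweight them using the fused embeddings and the additional metadata, but how that is done cannot be inferred from the preview.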

Funding

This work was supported by the National Natural Science Foundation of China, project no. 61672272; the Key Scientific Research Platform of Guangdong Provincial University, project nos. 2020ZDZX3033 and 2021ZDZX1030; the Scientific and Technological Project of Zhanjiang, project nos. 2020B01272 and 2020B01252; the Lingnan Normal University Scientific and Technological Project, project no. YB2105; and the Project of Human Social Science of Guangdong Province, project no. GD20XXW05.

Author information

Authors and Affiliations

Business School, Lingnan Normal University, 524048, Zhanjiang, China
Chunyan Yin
School of Information Engineering, Lingnan Normal University, 524048, Zhanjiang, China
Yongheng Chen
College of Computer Science and Technology, Jilin University, 130012, Changchun, China
Wanli Zuo

Corresponding author

Correspondence to Chunyan Yin.

Ethics declarations

COMPLIANCE WITH ETHICAL STANDARDS

This article is a completely original work of its authors; it has not been published before and will not be sent to other publications until the PRIA Editorial Board decides not to accept it for publication.

Conflict of Interest

The process of writing and the content of the article do not give grounds for raising the issue of a conflict of interest.

Additional information

Chun-yan Yin. She obtained her BS degree from Harbin Normal University. She is a university lecturer at the Business School, Lingnan Normal University. Her main research areas are database theory, machine learning, data mining, and granular computing.

Yong-heng Chen. He received his PhD degree from the Department of Computer Science and Technology, Jilin University, in 2012. He has been a Professor at the School of Information Engineering, Lingnan Normal University, since 2018. His current main research interests include data mining, web intelligence and ontology engineering, and information integration. He is a member of the System Software Committee of the China Computer Federation. He has published more than 20 papers in journals and international conferences.

Wan-li Zuo. He is a Professor and doctoral supervisor at the Department of Computer Science and Technology, Jilin University, and a CCF senior member. His main research areas are database theory, machine learning, data mining and web mining, web search engines, and web intelligence.

About this article

Cite this article

Yin, C., Chen, Y. & Zuo, W. An Attention Hierarchical Topic Modeling. Pattern Recognit. Image Anal. 31, 722–729 (2021). https://doi.org/10.1134/S1054661821040295

Received: 30 January 2021
Revised: 08 July 2021
Accepted: 17 July 2021
Published: 27 December 2021
Issue Date: October 2021
DOI: https://doi.org/10.1134/S1054661821040295
