
Improving topic disentanglement via contrastive learning

Published: 01 March 2023

Abstract

With the emergence of deep generative models such as the variational auto-encoder (VAE), research on topic modeling has extended to a new area: neural topic modeling, which aims to learn disentangled topics that help us understand the data better. However, the original VAE framework has been shown to be limited in disentanglement performance, and it passes these inherent defects on to neural topic models (NTMs). In this paper, we argue that the optimization objectives of contrastive learning are consistent with two important goals of well-disentangled topic learning, alignment and uniformity, and also with two key evaluation measures for topic models, topic coherence and topic diversity. We therefore conclude that the alignment and uniformity of disentangled topic learning can be quantified by topic coherence and topic diversity. Building on this observation, we propose the Contrastive Disentangled Neural Topic Model (CNTM). By representing both words and topics as low-dimensional vectors in the same embedding space, we apply contrastive learning to neural topic modeling to produce factorized and disentangled topics in an interpretable manner. We compare CNTM with strong baseline models on widely used metrics. Our model achieves the best topic coherence scores under the most general evaluation setting (100% of topics selected), improving on the second-best models' scores by 25.0%, 10.9%, 24.6%, and 51.3% on the 20 Newsgroups, Web Snippets, Tag My News, and Reuters datasets, respectively. Our method also achieves the second-best topic diversity scores on 20 Newsgroups and Web Snippets. These results show that CNTM effectively leverages the disentanglement ability of contrastive learning to address the inherent defects of neural topic modeling and obtain better topic quality.
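The correspondence the abstract draws between contrastive learning and disentangled topic learning can be made concrete with the alignment and uniformity losses of Wang and Isola (2020). Below is a minimal PyTorch sketch of those two losses, not the authors' implementation; function names, defaults, and the toy usage are illustrative only.

```python
import torch
import torch.nn.functional as F

def align_loss(x, y, alpha=2):
    # Alignment: L2-normalized embeddings of positive pairs (x[i], y[i])
    # should lie close together on the unit hypersphere. The paper relates
    # this property to topic coherence.
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniform_loss(x, t=2):
    # Uniformity: embeddings should spread out over the hypersphere,
    # measured as the log of the mean pairwise Gaussian potential. The
    # paper relates this property to topic diversity.
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

# Toy usage with random unit vectors standing in for topic/word embeddings.
x = F.normalize(torch.randn(256, 50), dim=1)
y = F.normalize(x + 0.1 * torch.randn_like(x), dim=1)  # noisy positives
loss = align_loss(x, y) + uniform_loss(x)
```

Lower values of both losses indicate better-aligned positive pairs and a more uniform, hence more diverse, embedding distribution.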

Highlights

We propose a contrastive disentangled neural topic model (CNTM) based on topic embeddings.
We introduce contrastive learning to neural topic modeling in an interpretable way.
We interpret the contrastive learning objective through mutual information (MI) maximization theory (see the sketch after this list).
Experimental results demonstrate our proposed model's disentanglement effectiveness.
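The MI-maximization reading of the contrastive objective is usually grounded in the InfoNCE loss, which lower-bounds the mutual information between a representation and its positive sample. The following is a generic in-batch InfoNCE sketch in PyTorch, given as an illustration of that connection rather than the paper's exact objective; `info_nce` and its arguments are hypothetical names.

```python
import torch
import torch.nn.functional as F

def info_nce(query, positive, temperature=0.07):
    # query, positive: (batch, dim); row i of `positive` is the positive
    # sample for row i of `query`; all other rows act as negatives.
    q = F.normalize(query, dim=1)
    p = F.normalize(positive, dim=1)
    logits = q @ p.t() / temperature        # (batch, batch) similarity scores
    labels = torch.arange(q.size(0), device=q.device)
    # Cross-entropy over in-batch candidates is the InfoNCE loss;
    # minimizing it tightens a lower bound on I(query; positive).
    return F.cross_entropy(logits, labels)
```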

Published In

Information Processing and Management: an International Journal, Volume 60, Issue 2 (March 2023), 1443 pages

Publisher

Pergamon Press, Inc., United States

Author Tags

  1. Topic model
  2. Contrastive learning
  3. Disentanglement

Qualifiers

  • Research-article
