
Probabilistic Approach for Embedding Arbitrary Features of Text

Published: 31 December 2018

Abstract

Topic modeling is typically used to model the words in documents as probabilistic mixtures of topics. We generalize this setup and consider arbitrary features of the positions in a corpus, e.g. “contains a word”, “belongs to a sentence”, “has a word in the local context”, “is labeled with a POS-tag”, etc. We build sparse probabilistic embeddings for positions and derive embeddings for the features by averaging those. Importantly, we interpret the EM-algorithm as an iterative process of intersection and averaging steps that re-estimate position and feature embeddings, respectively. With this approach, we obtain several insights. First, we argue that a sentence should not be represented as an average of its words: while each word is a mixture of multiple senses, each word occurrence typically refers to just one specific sense. So in our approach, we obtain sentence embeddings by averaging position embeddings from the E-step. Second, we show that the Biterm Topic Model (Yan et al. [11]) and the Word Network Topic Model (Zuo et al. [12]) are equivalent, differing only in whether word and context embeddings are tied. We further extend these models by adjusting the representation of each sliding window with a few iterations of the EM-algorithm. Finally, we aim at consistent embeddings for hierarchical entities, e.g. for the word-sentence-document structure. We discuss two alternative training schemes and generalize to the case where the middle level of the hierarchy is unknown. This provides a unified formulation for the topic segmentation and word sense disambiguation tasks.
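
To make the intersection/averaging reading of the EM-algorithm concrete, here is a minimal sketch in Python/NumPy. It is our illustration rather than the authors' implementation; the function name, the toy data, and the choice to treat every feature uniformly are assumptions. Each corpus position carries a set of feature ids; the E-step intersects (multiplies and renormalizes) the topic distributions of a position's features, and the M-step averages position distributions back into feature embeddings. A sentence embedding is then the average of the E-step distributions of its positions, as argued in the abstract.

    import numpy as np

    def em_intersect_average(position_features, n_features, n_topics, n_iters=20, seed=0):
        # position_features: one list of active feature ids per corpus position
        # (e.g. the word id, the sentence id, ids of context words).
        rng = np.random.default_rng(seed)
        phi = rng.random((n_features, n_topics))        # feature embeddings
        phi /= phi.sum(axis=1, keepdims=True)           # rows are topic distributions

        for _ in range(n_iters):
            # E-step ("intersection"): a position's sparse topic distribution is
            # the renormalized elementwise product of its features' distributions.
            theta = np.ones((len(position_features), n_topics))
            for i, feats in enumerate(position_features):
                for f in feats:
                    theta[i] *= phi[f]
            theta /= theta.sum(axis=1, keepdims=True) + 1e-30

            # M-step ("averaging"): a feature's embedding becomes the average of
            # the distributions of the positions where that feature is active.
            acc = np.full((n_features, n_topics), 1e-12)
            cnt = np.zeros(n_features)
            for i, feats in enumerate(position_features):
                for f in feats:
                    acc[f] += theta[i]
                    cnt[f] += 1
            phi = acc / np.maximum(cnt, 1)[:, None]
            phi /= phi.sum(axis=1, keepdims=True)
        return phi, theta

    # Toy usage: features 0-2 are words, feature 3 marks a sentence shared by the
    # first two positions. Per the abstract, the sentence embedding is the average
    # of the E-step position embeddings, not of the (ambiguous) word embeddings.
    phi, theta = em_intersect_average([[0, 3], [1, 3], [2]], n_features=4, n_topics=2)
    sentence_embedding = theta[[0, 1]].mean(axis=0)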

References

[1]
Arora, S., Li, Y., Liang, Y., Ma, T., Risteski, A.: Linear algebraic structure of word senses, with applications to polysemy. CoRR abs/1601.03764 (2016)
[2]
Arora, S., Liang, Y., Ma, T.: A simple but tough-to-beat baseline for sentence embeddings. In: International Conference on Learning Representations (2017)
[3]
Conneau, A., Kiela, D., Schwenk, H., Barrault, L., Bordes, A.: Supervised learning of universal sentence representations from natural language inference data. In: Proceedings of EMNLP, pp. 670–680. Association for Computational Linguistics (2017)
[4]
Hofmann, T.: Probabilistic latent semantic analysis. In: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, UAI 1999, pp. 289–296. Morgan Kaufmann Publishers Inc., San Francisco (1999)
[5]
Inan, H., Khosravi, K., Socher, R.: Tying word vectors and word classifiers: A loss framework for language modeling. CoRR abs/1611.01462 (2016)
[6]
Kiros, R., et al.: Skip-thought vectors. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS 2015, pp. 3294–3302. MIT Press, Cambridge (2015)
[7]
Kochedykov, D., Apishev, M., Golitsyn, L., Vorontsov, K.: Fast and modular regularized topic modelling. In: Proceedings of the 21st Conference of FRUCT Association, ISMW, pp. 182–193 (2017)
[8]
Pagliardini, M., Gupta, P., Jaggi, M.: Unsupervised learning of sentence embeddings using compositional n-gram features. In: Proceedings of NAACL (2018)
[9]
Potapenko, A., Popov, A., Vorontsov, K.: Interpretable probabilistic embeddings: bridging the gap between topic models and neural networks. In: Filchenkov, A., Pivovarova, L., Žižka, J. (eds.) Artificial Intelligence and Natural Language, pp. 167–180. Springer, Cham (2018)
[10]
Press, O., Wolf, L.: Using the output embedding to improve language models. In: Proceedings of ACL: Volume 2, Short Papers, pp. 157–163. ACL (2017)
[11]
Yan, X., Guo, J., Lan, Y., Cheng, X.: A biterm topic model for short texts. In: Proceedings of WWW, pp. 1445–1456 (2013)
[12]
Zuo, Y., Zhao, J., Xu, K.: Word network topic model: a simple but general solution for short and imbalanced texts. Knowl. Inf. Syst. 48(2), 379–398 (2016)

Published In

Analysis of Images, Social Networks and Texts: 7th International Conference, AIST 2018, Moscow, Russia, July 5–7, 2018, Revised Selected Papers
Jul 2018
357 pages
ISBN:978-3-030-11026-0
DOI:10.1007/978-3-030-11027-7

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Topic models
  2. Word embeddings
  3. EM-algorithm
