
research-article

Contextual Text Understanding in Distributional Semantic Space

Authors: Jianpeng Cheng, Zhongyuan Wang, Zheng Chen

CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

Pages 133 - 142

https://doi.org/10.1145/2806416.2806517

Published: 17 October 2015

Abstract

Representing discrete words in a continuous vector space has proven useful for natural language applications related to text understanding. It also poses substantial challenges, one of which stems from the polysemous nature of human language. A common solution, known as word sense induction, is to separate each word into multiple senses and create a representation for each sense. However, this approach is usually computationally expensive and prone to data sparsity, since each sense must be handled separately. In this work, we propose a new framework for generating context-aware text representations without diving into the sense space. We model the concept space shared among senses, resulting in a framework that is efficient in both computation and storage. Specifically, the proposed framework i) projects both words and concepts into the same vector space, and ii) obtains unambiguous word representations that not only preserve the uniqueness among words but also reflect their context-appropriate meanings. We demonstrate the effectiveness of the framework on a number of text understanding tasks, including word/phrase similarity measurement, paraphrase identification, and question-answer relatedness classification.
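The general idea of placing words and concepts in one vector space, and letting context pick out the relevant concepts, can be illustrated with a small sketch. This is not the authors' model: the three-dimensional vectors, the `WORD_CONCEPTS` lookup, the softmax weighting, and the `alpha` mixing parameter are all assumptions made up for illustration.

```python
import math

# Hypothetical toy vectors; a real system would learn these from data.
WORD_VECS = {
    "apple": [0.6, 0.6, 0.1],
    "pie": [0.9, 0.1, 0.0],
    "iphone": [0.1, 0.9, 0.1],
}
# Concepts live in the same space as words (assumed values).
CONCEPT_VECS = {
    "fruit": [1.0, 0.0, 0.0],
    "company": [0.0, 1.0, 0.0],
}
# Hypothetical knowledge-base lookup: concepts a word may belong to.
WORD_CONCEPTS = {"apple": ["fruit", "company"]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def contextual_vector(word, context_words, alpha=0.5):
    """Blend a word vector with its context-weighted concept vectors."""
    base = WORD_VECS[word]
    concepts = WORD_CONCEPTS.get(word, [])
    context = [WORD_VECS[c] for c in context_words if c in WORD_VECS]
    if not concepts or not context:
        return list(base)  # nothing to disambiguate with
    dim = len(base)
    # Average the context word vectors.
    ctx = [sum(vec[i] for vec in context) / len(context) for i in range(dim)]
    # Weight each candidate concept by its similarity to the context.
    weights = softmax([cosine(CONCEPT_VECS[c], ctx) for c in concepts])
    mix = [
        sum(w * CONCEPT_VECS[c][i] for w, c in zip(weights, concepts))
        for i in range(dim)
    ]
    # Keep part of the word's own vector so distinct words stay distinct.
    return [(1 - alpha) * b + alpha * m for b, m in zip(base, mix)]


fruit_apple = contextual_vector("apple", ["pie"])
tech_apple = contextual_vector("apple", ["iphone"])
print(cosine(fruit_apple, CONCEPT_VECS["fruit"]))
print(cosine(tech_apple, CONCEPT_VECS["fruit"]))
```

In this toy setup, "apple" next to "pie" moves toward the fruit concept and "apple" next to "iphone" moves toward the company concept, while only one vector per word and one per concept is stored rather than one per sense.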

Index Terms

Information systems

Published In

CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

October 2015

1998 pages

ISBN: 9781450337946

DOI: 10.1145/2806416

General Chairs: James Bailey (The University of Melbourne), Alistair Moffat (The University of Melbourne)

Program Chairs: Charu C. Aggarwal (IBM), Maarten de Rijke (University of Amsterdam), Ravi Kumar (Google), Vanessa Murdock (Microsoft), Timos Sellis (RMIT University), Jeffrey Xu Yu (Chinese University of Hong Kong)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

the Fundamental Research Funds for the Central Universities & the Research Funds of Renmin University of China
the National Key Basic Research Program (973 Program) of China

Conference

CIKM'15: 24th ACM International Conference on Information and Knowledge Management

October 18 - 23, 2015

Melbourne, Australia

Acceptance Rates

CIKM '15 Paper Acceptance Rate 165 of 646 submissions, 26%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics

Total Citations: 30
Total Downloads: 537
Downloads (last 12 months): 19
Downloads (last 6 weeks): 1

Reflects downloads up to 20 Jan 2025

Cited By

Ilievski F, Shenoy K, Chalupsky H, Klein N, Szekely P (2024). A study of concept similarity in Wikidata. Semantic Web, pp. 1-20. Online publication date: 8-Jan-2024. https://doi.org/10.3233/SW-233520
Wang J, Wang B, Gao J, Li X, Hu Y, Yin B (2024). QDN: A Quadruplet Distributor Network for Temporal Knowledge Graph Completion. IEEE Transactions on Neural Networks and Learning Systems, 35:10, pp. 14018-14030. Online publication date: Oct-2024. https://doi.org/10.1109/TNNLS.2023.3274230
Lu Y, Yang D, Wang P, Rosso P, Cudre-Mauroux P (2024). Schema-Aware Hyper-Relational Knowledge Graph Embeddings for Link Prediction. IEEE Transactions on Knowledge and Data Engineering, pp. 1-15. Online publication date: 2024. https://doi.org/10.1109/TKDE.2023.3323499
Wang J, Wang B, Gao J, Hu Y, Yin B (2023). Multi-Concept Representation Learning for Knowledge Graph Completion. ACM Transactions on Knowledge Discovery from Data, 17:1, pp. 1-19. Online publication date: 20-Feb-2023. https://dl.acm.org/doi/10.1145/3533017
Wang J, Wang B, Gao J, Li X, Hu Y, Yin B (2023). TDN: Triplet Distributor Network for Knowledge Graph Completion. IEEE Transactions on Knowledge and Data Engineering, 35:12, pp. 13002-13014. Online publication date: 1-Dec-2023. https://doi.org/10.1109/TKDE.2023.3272568
Zhao Y, Chao K, Li Y (2023). Domain Information Learning For Multi-Domain Knowledge Graph Link Prediction. 2023 IEEE International Conference on e-Business Engineering (ICEBE), pp. 246-252. Online publication date: 4-Nov-2023. https://doi.org/10.1109/ICEBE59045.2023.00023
Zhou X, Niu L, Zhu Q, Zhu X, Liu P, Tan J, Guo L (2022). Knowledge Graph Embedding by Double Limit Scoring Loss. IEEE Transactions on Knowledge and Data Engineering, 34:12, pp. 5825-5839. Online publication date: 1-Dec-2022. https://doi.org/10.1109/TKDE.2021.3060755
Najafipour S, Hosseini S, Hua W, Kangavari M, Zhou X (2022). SoulMate: Short-Text Author Linking Through Multi-Aspect Temporal-Textual Embedding. IEEE Transactions on Knowledge and Data Engineering, 34:1, pp. 448-461. Online publication date: 1-Jan-2022. https://dl.acm.org/doi/10.1109/TKDE.2020.2982148
Wang Z, Zheng Y, Zhu H, Yang C, Chen T (2022). Transferable adversarial examples can efficiently fool topic models. Computers and Security, 118:C. Online publication date: 1-Jul-2022. https://dl.acm.org/doi/10.1016/j.cose.2022.102749
Demidovskij A (2021). Encoding and Decoding of Recursive Structures in Neural-Symbolic Systems. Optical Memory and Neural Networks, 30:1, pp. 37-50. Online publication date: 1-Jan-2021. https://dl.acm.org/doi/10.3103/S1060992X21010033
