DOI: 10.1145/2806416.2806517
research-article

Contextual Text Understanding in Distributional Semantic Space

Published: 17 October 2015

Abstract

Representing discrete words in a continuous vector space has proven useful for natural language applications related to text understanding. It also poses substantial challenges, one of which stems from the polysemous nature of human language. A common solution (a.k.a. word sense induction) is to split each word into multiple senses and learn a separate representation for each sense. However, this approach is usually computationally expensive and prone to data sparsity, since each sense must be modeled separately. In this work, we propose a new framework for generating context-aware text representations without diving into the sense space. We model the concept space shared among senses, resulting in a framework that is efficient in both computation and storage. Specifically, the proposed framework: i) projects both words and concepts into the same vector space; and ii) obtains unambiguous word representations that not only preserve the uniqueness among words, but also reflect their context-appropriate meanings. We demonstrate the effectiveness of the framework on a number of text understanding tasks, including word/phrase similarity measurement, paraphrase identification, and question-answer relatedness classification.
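
To make the idea in the abstract concrete, the following is a minimal sketch of one way a context-aware word vector could be formed when words and concepts live in the same vector space. It is not the method from the paper: the toy vectors, the word-to-concept table, the softmax weighting, and the `mix` parameter are all assumptions chosen for illustration.

```python
import math

# Toy 3-d vectors in one shared space. All values are invented for illustration.
WORD_VECS = {
    "apple": [0.5, 0.5, 0.0],
    "pie": [0.9, 0.1, 0.0],
    "iphone": [0.1, 0.9, 0.0],
}
CONCEPT_VECS = {
    "fruit": [1.0, 0.0, 0.0],
    "company": [0.0, 1.0, 0.0],
}
# Hypothetical mapping from an ambiguous word to its candidate concepts.
WORD_CONCEPTS = {"apple": ["fruit", "company"]}


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def contextual_vector(word, context, mix=0.5):
    """Blend a word vector with its context-weighted concept vectors.

    `mix` (an assumed parameter) controls how much the concepts shift the word.
    """
    base = WORD_VECS[word]
    concepts = WORD_CONCEPTS.get(word, [])
    ctx_vecs = [WORD_VECS[c] for c in context if c in WORD_VECS]
    if not concepts or not ctx_vecs:
        return list(base)  # nothing to disambiguate with
    # Average the context word vectors.
    ctx = [sum(col) / len(ctx_vecs) for col in zip(*ctx_vecs)]
    # Softmax over concept-to-context similarity gives the concept weights.
    scores = [math.exp(cosine(CONCEPT_VECS[c], ctx)) for c in concepts]
    total = sum(scores)
    blended = [
        sum(s / total * CONCEPT_VECS[c][i] for s, c in zip(scores, concepts))
        for i in range(len(base))
    ]
    # The word keeps part of its own vector, so distinct words stay distinct.
    return [(1 - mix) * b + mix * k for b, k in zip(base, blended)]


apple_food = contextual_vector("apple", ["pie"])
apple_tech = contextual_vector("apple", ["iphone"])
```

With these toy values, `apple_food` lies closer to the `fruit` concept and `apple_tech` closer to the `company` concept, while both remain derived from the single stored vector for "apple". No per-sense vectors are stored, which is the kind of saving the abstract attributes to working in the concept space.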


Cited By

  • (2024) A study of concept similarity in Wikidata. Semantic Web, pages 1-20. DOI: 10.3233/SW-233520. Online publication date: 8-Jan-2024
  • (2024) QDN: A Quadruplet Distributor Network for Temporal Knowledge Graph Completion. IEEE Transactions on Neural Networks and Learning Systems, 35:10, pages 14018-14030. DOI: 10.1109/TNNLS.2023.3274230. Online publication date: Oct-2024
  • (2024) Schema-Aware Hyper-Relational Knowledge Graph Embeddings for Link Prediction. IEEE Transactions on Knowledge and Data Engineering, pages 1-15. DOI: 10.1109/TKDE.2023.3323499. Online publication date: 2024

    Published In

    CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
    October 2015
    1998 pages
    ISBN:9781450337946
    DOI:10.1145/2806416
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. computational linguistics
    2. distributional models
    3. knowledge representation
    4. machine learning
    5. word sense disambiguation

    Qualifiers

    • Research-article

    Funding Sources

    • the Fundamental Research Funds for the Central Universities & the Research Funds of Renmin University of China
    • the National Key Basic Research Program (973 Program) of China

    Conference

    CIKM'15
    Acceptance Rates

    CIKM '15 Paper Acceptance Rate 165 of 646 submissions, 26%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
