research-article

Semantic Compositionality in Tree Kernels

Authors:

Roberto Basili

CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management

Pages 1029 - 1038

https://doi.org/10.1145/2661829.2661955

Published: 03 November 2014

Abstract

Kernel-based learning has been widely applied to semantic textual inference tasks. In particular, Tree Kernels (TKs) are crucial in modeling syntactic similarity between linguistic instances in Question Answering or Information Extraction tasks. At the same time, lexical semantic information has been studied through the Distributional Semantics (DS) paradigm, where lexical vectors are acquired automatically from large corpora. Methods to account for compositional linguistic structures (e.g. grammatically typed bi-grams or complex verb or noun phrases) have recently been proposed by defining algebras on lexical vectors. The result is an extended paradigm called Distributional Compositional Semantics (DCS). Although lexical extensions have already been proposed to generalize TKs towards semantic phenomena (e.g. predicate-argument structures, as in role labeling), currently studied TKs do not, in general, account for compositionality. In this paper, a novel kernel called the Compositionally Smoothed Partial Tree Kernel is proposed to integrate DCS operators into the tree kernel evaluation by acting over both lexical leaves and non-terminal, i.e. complex compositional, nodes. The empirical results obtained on Question Classification and Paraphrase Identification tasks show that state-of-the-art performance can be achieved without resorting to manual feature engineering, suggesting that a large set of Web and text mining tasks can be handled successfully by the kernel proposed here.
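To make the idea of "algebras on lexical vectors" concrete, the following is a minimal sketch of two composition operators that are common in the distributional compositional literature: element-wise addition and element-wise multiplication of the vectors of a head word and its modifier. It is not the Compositionally Smoothed Partial Tree Kernel itself. The vector values, the `VECTORS` table, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Toy lexical vectors. The values are invented for illustration and are
# NOT derived from any corpus or from the paper.
VECTORS = {
    "buy": [0.9, 0.1, 0.3],
    "purchase": [0.8, 0.2, 0.3],
    "car": [0.1, 0.9, 0.4],
    "automobile": [0.2, 0.8, 0.5],
    "banana": [0.1, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compose_additive(head, modifier):
    """Additive composition: element-wise sum of the two word vectors."""
    return [h + m for h, m in zip(VECTORS[head], VECTORS[modifier])]


def compose_multiplicative(head, modifier):
    """Multiplicative composition: element-wise product of the two word vectors."""
    return [h * m for h, m in zip(VECTORS[head], VECTORS[modifier])]


def bigram_similarity(pair_a, pair_b, compose=compose_additive):
    """Similarity of two (head, modifier) pairs after composing each pair."""
    return cosine(compose(*pair_a), compose(*pair_b))


# Compare a near-paraphrase pair with a semantically distant pair.
sim_close = bigram_similarity(("buy", "car"), ("purchase", "automobile"))
sim_far = bigram_similarity(("buy", "car"), ("buy", "banana"))
sim_close_mult = bigram_similarity(
    ("buy", "car"), ("purchase", "automobile"), compose=compose_multiplicative
)
sim_far_mult = bigram_similarity(
    ("buy", "car"), ("buy", "banana"), compose=compose_multiplicative
)
```

With these toy vectors, both operators score "buy car" closer to "purchase automobile" than to "buy banana". A kernel that smooths node matching with such composed vectors could, under this assumption, reward trees whose phrases are similar in meaning even when the words differ.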

Index Terms

Semantic Compositionality in Tree Kernels
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management

November 2014

2152 pages

ISBN: 9781450325981

DOI: 10.1145/2661829

General Chairs:
Jianzhong Li (Harbin Inst. of Technology)
X. Sean Wang (Fudan University)

Program Chairs:
Minos Garofalakis (Technical University of Crete, Greece)
Ian Soboroff (National Institute of Standards, USA)
Torsten Suel (New York University, USA)
Min Wang (Google Research, USA)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '14

CIKM '14: 2014 ACM Conference on Information and Knowledge Management

November 3 - 7, 2014

Shanghai, China

Acceptance Rates

CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

Total Citations: 9
Total Downloads: 289
Downloads (Last 12 months): 6
Downloads (Last 6 weeks): 1

Reflects downloads up to 22 Dec 2024

Cited By

Fang Y, Tian X, Wu H, Gu S, Wang Z, Wang F, Li J, Weng Y (2020). Few-Shot Learning for Chinese Legal Controversial Issues Classification. IEEE Access 8, 75022-75034. Online publication date: 2020.
https://doi.org/10.1109/ACCESS.2020.2988493
Weng Y, Fang Y, Yan H, Yang Y, Hong W (2019). Bayesian Non-Parametric Classification With Tree-Based Feature Transformation for NIPPV Efficacy Prediction in COPD Patients. IEEE Access 7, 177774-177783. Online publication date: 2019.
https://doi.org/10.1109/ACCESS.2019.2958047
Croce D, Castellucci G, Basili R (2019). Kernel-Based Generative Adversarial Networks for Weakly Supervised Learning. AI*IA 2019 – Advances in Artificial Intelligence, 336-347. Online publication date: 12-Nov-2019.
https://doi.org/10.1007/978-3-030-35166-3_24
Warikoo N, Chang Y, Hsu W (2018). LPTK: a linguistic pattern-aware dependency tree kernel approach for the BioCreative VI CHEMPROT task. Database 2018. Online publication date: 22-Oct-2018.
https://doi.org/10.1093/database/bay108
Filice S, Castellucci G, Da San Martino G, Moschitti A, Croce D, Basili R (2017). KeLP. The Journal of Machine Learning Research 18:1, 6993-6997. Online publication date: 1-Jan-2017.
https://dl.acm.org/doi/10.5555/3122009.3242048
Croce D, Filice S, Basili R (2017). Effective and scalable kernel-based language learning via stratified Nyström methods. Intelligenza Artificiale 11:2, 93-116. Online publication date: 5-Dec-2017.
https://doi.org/10.3233/IA-170109
Croce D, Filice S, Basili R (2017). On the Impact of Linguistic Information in Kernel-Based Deep Architectures. AI*IA 2017 Advances in Artificial Intelligence, 359-371. Online publication date: 7-Nov-2017.
https://doi.org/10.1007/978-3-319-70169-1_27
Chang Y, Chen C, Hsu W (2016). SPIRIT. IEEE Transactions on Knowledge and Data Engineering 28:9, 2494-2507. Online publication date: 1-Sep-2016.
https://dl.acm.org/doi/10.1109/TKDE.2016.2566620
Kenter T, de Rijke M (2015). Short Text Similarity with Word Embeddings. In Bailey J, Moffat A, Aggarwal C, de Rijke M, Kumar R, Murdock V, Sellis T, Yu J (eds.), Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, 1411-1420. Online publication date: 17-Oct-2015.
https://dl.acm.org/doi/10.1145/2806416.2806475
