DOI: 10.1145/2661829.2661955

Semantic Compositionality in Tree Kernels

Published: 03 November 2014

Abstract

Kernel-based learning has been widely applied to semantic textual inference tasks. In particular, Tree Kernels (TKs) are crucial for modeling syntactic similarity between linguistic instances in Question Answering and Information Extraction tasks. At the same time, lexical semantic information has been studied through the Distributional Semantics (DS) paradigm, in which lexical vectors are acquired automatically from large corpora. More recently, methods that account for compositional linguistic structures (e.g. grammatically typed bi-grams or complex verb or noun phrases) have been proposed by defining algebras on lexical vectors. The result is an extended paradigm called Distributional Compositional Semantics (DCS). Although lexical extensions have already been proposed to generalize TKs towards semantic phenomena (e.g. the predicate-argument structures used in role labeling), currently studied TKs do not, in general, account for compositionality. In this paper, a novel kernel called the Compositionally Smoothed Partial Tree Kernel is proposed to integrate DCS operators into the tree kernel evaluation by acting on both lexical leaves and non-terminal, i.e. complex compositional, nodes. Empirical results on Question Classification and Paraphrase Identification tasks show that state-of-the-art performance can be achieved without resorting to manual feature engineering, suggesting that a large set of Web and text mining tasks can be handled successfully by the proposed kernel.
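
To make the idea of "algebras on lexical vectors" concrete, the following is a minimal sketch and not the kernel proposed in the paper. It assumes additive composition (one of the simplest operators in the compositional distributional literature) and cosine similarity as the comparison between two composed head/modifier pairs. The vocabulary, the vector values, and the names `VECTORS`, `compose`, and `node_similarity` are hypothetical and chosen only for illustration.

```python
import math

# Toy 3-dimensional lexical vectors. The values are invented for illustration;
# real distributional vectors would be acquired from a large corpus.
VECTORS = {
    "buy": [0.9, 0.1, 0.2],
    "purchase": [0.8, 0.2, 0.3],
    "car": [0.1, 0.9, 0.3],
    "vehicle": [0.2, 0.8, 0.4],
    "eat": [0.1, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def compose(head, modifier):
    """Additive composition (an assumption): sum the head and modifier vectors."""
    return [h + m for h, m in zip(VECTORS[head], VECTORS[modifier])]


def node_similarity(pair_a, pair_b):
    """Compare two (head, modifier) pairs through their composed vectors."""
    return cosine(compose(*pair_a), compose(*pair_b))


# A paraphrase-like pair should score higher than a pair with a different verb.
sim_close = node_similarity(("buy", "car"), ("purchase", "vehicle"))
sim_far = node_similarity(("buy", "car"), ("eat", "car"))
print(f"buy car vs purchase vehicle: {sim_close:.3f}")
print(f"buy car vs eat car:          {sim_far:.3f}")
```

In a tree kernel setting, a similarity of this kind could stand in for exact label matching at compositional nodes, so that structurally similar subtrees with different but related words still contribute to the kernel value. How the paper actually weights and combines such scores is not specified in the abstract.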

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
November 2014
2152 pages
ISBN: 9781450325981
DOI: 10.1145/2661829

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. compositional distributional semantics
  2. kernel machines
  3. natural language processing
  4. tree kernel

Qualifiers

  • Research-article

Conference

CIKM '14

Acceptance Rates

CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 6
  • Downloads (last 6 weeks): 1
Reflects downloads up to 22 Dec 2024

Cited By

  • (2020) Few-Shot Learning for Chinese Legal Controversial Issues Classification. IEEE Access 8, 75022-75034. DOI: 10.1109/ACCESS.2020.2988493. Online publication date: 2020
  • (2019) Bayesian Non-Parametric Classification With Tree-Based Feature Transformation for NIPPV Efficacy Prediction in COPD Patients. IEEE Access 7, 177774-177783. DOI: 10.1109/ACCESS.2019.2958047. Online publication date: 2019
  • (2019) Kernel-Based Generative Adversarial Networks for Weakly Supervised Learning. AI*IA 2019 – Advances in Artificial Intelligence, 336-347. DOI: 10.1007/978-3-030-35166-3_24. Online publication date: 12-Nov-2019
  • (2018) LPTK: a linguistic pattern-aware dependency tree kernel approach for the BioCreative VI CHEMPROT task. Database 2018. DOI: 10.1093/database/bay108. Online publication date: 22-Oct-2018
  • (2017) KeLP. The Journal of Machine Learning Research 18:1, 6993-6997. DOI: 10.5555/3122009.3242048. Online publication date: 1-Jan-2017
  • (2017) Effective and scalable kernel-based language learning via stratified Nyström methods. Intelligenza Artificiale 11:2, 93-116. DOI: 10.3233/IA-170109. Online publication date: 5-Dec-2017
  • (2017) On the Impact of Linguistic Information in Kernel-Based Deep Architectures. AI*IA 2017 Advances in Artificial Intelligence, 359-371. DOI: 10.1007/978-3-319-70169-1_27. Online publication date: 7-Nov-2017
  • (2016) SPIRIT. IEEE Transactions on Knowledge and Data Engineering 28:9, 2494-2507. DOI: 10.1109/TKDE.2016.2566620. Online publication date: 1-Sep-2016
  • (2015) Short Text Similarity with Word Embeddings. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, 1411-1420. DOI: 10.1145/2806416.2806475. Online publication date: 17-Oct-2015
