DOI: 10.3115/1075096.1075100

Fast methods for kernel-based text analysis

Published: 07 July 2003

Abstract

Kernel-based learning (e.g., Support Vector Machines) has been successfully applied to many hard problems in Natural Language Processing (NLP). In NLP, although feature combinations are crucial to improving performance, they are heuristically selected. Kernel methods change this situation. The merit of kernel methods is that effective feature combinations are implicitly expanded without loss of generality and without increasing the computational costs. Kernel-based text analysis shows excellent performance in terms of accuracy; however, these methods are usually too slow to apply to large-scale text analysis. In this paper, we extend a Basket Mining algorithm to convert a kernel-based classifier into a simple and fast linear classifier. Experimental results on English BaseNP Chunking, Japanese Word Segmentation and Japanese Dependency Parsing show that our new classifiers are about 30 to 300 times faster than the standard kernel-based classifiers.
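
The conversion the abstract describes can be illustrated with a small sketch. It rests on assumptions that the abstract does not state: a degree-2 polynomial kernel K(X, Y) = (1 + |X ∩ Y|)^2 over sets of binary features, and hypothetical names (`kernel_score`, `expand`, `linear_score`). Since (1 + n)^2 = 1 + 3n + 2·C(n, 2), each support vector contributes a fixed weight to every feature subset of size 0, 1 or 2 that it contains, so the kernel classifier can be rewritten as a linear classifier over those subsets. The sketch enumerates all such subsets exhaustively; it does not use the Basket Mining step that the paper applies to select them.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: degree-2 polynomial kernel on binary feature sets.
# (1 + n)^2 = 1 + 3*n + 2*C(n, 2), so a subset of size k gets COEFF[k].
COEFF = {0: 1.0, 1: 3.0, 2: 2.0}


def kernel_score(support_vectors, x, bias=0.0):
    """Kernel form: sum over support vectors of alpha * (1 + |sv & x|)^2."""
    x = frozenset(x)
    return bias + sum(alpha * (1 + len(sv & x)) ** 2
                      for alpha, sv in support_vectors)


def expand(support_vectors):
    """Precompute one linear weight per feature subset of size <= 2."""
    weights = defaultdict(float)
    for alpha, sv in support_vectors:
        items = sorted(sv)  # sorted so subsets have a canonical tuple form
        for size in range(3):
            for subset in combinations(items, size):
                weights[subset] += COEFF[size] * alpha
    return dict(weights)


def linear_score(weights, x, bias=0.0):
    """Linear form: sum the weights of all subsets of x of size <= 2."""
    items = sorted(set(x))
    total = bias
    for size in range(3):
        for subset in combinations(items, size):
            total += weights.get(subset, 0.0)
    return total


if __name__ == "__main__":
    # Toy support vectors as (alpha, feature set) pairs; values are made up.
    svs = [(0.5, frozenset({"a", "b", "c"})), (-0.25, frozenset({"b", "d"}))]
    weights = expand(svs)
    example = {"a", "b", "d"}
    print(kernel_score(svs, example), linear_score(weights, example))
```

In this sketch the two scores agree for any input, but the cost of `linear_score` depends only on the number of features in the example, whereas `kernel_score` grows with the number of support vectors.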

Information

Published In

ACL '03: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
July 2003
571 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 86
  • Downloads (last 6 weeks): 9
Reflects downloads up to 15 Jan 2025.

Citations

Cited By

  • (2018) Personalized Context-Aware Point of Interest Recommendation. ACM Transactions on Information Systems 36:4 (1-28). DOI: 10.1145/3231933. Online publication date: 3-Oct-2018
  • (2018) Linguistic Knowledge-Aware Neural Machine Translation. IEEE/ACM Transactions on Audio, Speech and Language Processing 26:12 (2341-2354). DOI: 10.1109/TASLP.2018.2864648. Online publication date: 1-Dec-2018
  • (2017) ELSABET. Proceedings of the 19th International Conference on Information Integration and Web-based Applications & Services (558-562). DOI: 10.1145/3151759.3151805. Online publication date: 4-Dec-2017
  • (2016) Field-aware Factorization Machines for CTR Prediction. Proceedings of the 10th ACM Conference on Recommender Systems (43-50). DOI: 10.1145/2959100.2959134. Online publication date: 7-Sep-2016
  • (2015) JAMRED. Proceedings of the 17th International Conference on Information Integration and Web-based Applications & Services (1-5). DOI: 10.1145/2837185.2837246. Online publication date: 11-Dec-2015
  • (2015) JILL. Proceedings of the 17th International Conference on Information Integration and Web-based Applications & Services (1-9). DOI: 10.1145/2837185.2837191. Online publication date: 11-Dec-2015
  • (2013) Finite rank kernels for multi-task learning. Advances in Computational Mathematics 38:2 (427-439). DOI: 10.1007/s10444-011-9244-x. Online publication date: 1-Feb-2013
  • (2013) Searching emotional scenes in TV programs based on twitter emotion analysis. Proceedings of the 5th international conference on Online Communities and Social Computing (432-441). DOI: 10.1007/978-3-642-39371-6_48. Online publication date: 21-Jul-2013
  • (2013) Supervised Learning and Distributional Semantic Models for Super-Sense Tagging. Proceeding of the XIIIth International Conference on AI*IA 2013: Advances in Artificial Intelligence - Volume 8249 (97-108). DOI: 10.1007/978-3-319-03524-6_9. Online publication date: 4-Dec-2013
  • (2012) Statistical modality tagging from rule-based annotations and crowdsourcing. Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics (57-64). DOI: 10.5555/2392701.2392708. Online publication date: 13-Jul-2012