Heuristic Algorithm for Zero Subject Detection in Polish

Adam Kaczmarek & Michał Marcińczuk

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9302)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

This article describes a heuristic approach to zero subject detection in Polish, treating it as a crucial step in end-to-end coreference resolution. Zero subject verbs are recognized using a set of manually created rules that draw on information from several sources: a dependency parser, a shallow relational parser, and a valence dictionary. The rules were developed and evaluated on the Polish Coreference Corpus. The experimental results show that the presented method significantly outperforms MentionDetector, the only machine learning-based alternative for Polish. We also discuss and evaluate the importance of zero subject detection for existing coreference resolution tools for Polish.
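The abstract does not list the rules themselves, so the following Python sketch is only an illustration of how evidence from the three kinds of sources it names might be combined into one decision. It is not the authors' rule set. The `VerbEvidence` record, its field names, and the order of the checks are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class VerbEvidence:
    """Hypothetical per-verb evidence; every field name is an assumption."""
    form: str                     # surface form of the verb
    is_finite: bool               # finite personal form, per the morphological tag
    dep_subject: bool             # dependency parser attached a subject to the verb
    shallow_subject: bool         # shallow relational parser found a subject relation
    valence_allows_subject: bool  # valence dictionary entry has a subject slot


def is_zero_subject(v: VerbEvidence) -> bool:
    """Illustrative heuristic: a finite verb that can take a subject,
    for which neither parser found an overt one."""
    if not v.is_finite:
        return False  # non-finite forms are not candidates in this sketch
    if not v.valence_allows_subject:
        return False  # subjectless verbs have no subject to drop
    if v.dep_subject or v.shallow_subject:
        return False  # an overt subject was found by at least one source
    return True


# "Poszedł do domu." ("He went home."): no overt subject in the clause.
dropped = VerbEvidence("Poszedł", True, False, False, True)
# "Jan poszedł do domu." ("Jan went home."): overt subject "Jan".
overt = VerbEvidence("poszedł", True, True, True, True)
# "Grzmi." ("It thunders."): assumed here to have no subject slot.
impersonal = VerbEvidence("Grzmi", True, False, False, False)

for ev in (dropped, overt, impersonal):
    print(ev.form, is_zero_subject(ev))
```

In this sketch a subject reported by either parser is enough to rule out a zero subject. That is one possible design choice among several; a real rule set could weight or order the sources differently.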

Author information

Authors and Affiliations

Computational Intelligence Research Group, Institute of Computer Science, University of Wrocław, Wrocław, Poland
Adam Kaczmarek
G4.19 Research Group: Computational Linguistics and Language Technology, Department of Computational Intelligence, Wrocław University of Technology, Wrocław, Poland
Adam Kaczmarek & Michał Marcińczuk

Corresponding author

Correspondence to Adam Kaczmarek.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Pavel Král
University of West Bohemia, Pilsen, Czech Republic
Václav Matoušek

About this paper

Cite this paper

Kaczmarek, A., Marcińczuk, M. (2015). Heuristic Algorithm for Zero Subject Detection in Polish. In: Král, P., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2015. Lecture Notes in Computer Science, vol 9302. Springer, Cham. https://doi.org/10.1007/978-3-319-24033-6_43

DOI: https://doi.org/10.1007/978-3-319-24033-6_43
Published: 11 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24032-9
Online ISBN: 978-3-319-24033-6
eBook Packages: Computer Science, Computer Science (R0)
