Abstract
PolEval is a SemEval-inspired evaluation campaign for natural language processing tools for Polish. Submitted tools compete against one another within tasks selected by the organizers, using available data, and are evaluated according to pre-established procedures. The campaign has been organized since 2017, and each year the winning systems become the state of the art in Polish language processing in their respective tasks. In 2019 we organized six different tasks, creating an even greater opportunity for NLP researchers to evaluate their systems in an objective manner.
Acknowledgements
The work on temporal expression recognition and phrase lemmatization was financed as part of the investment in the CLARIN-PL research infrastructure funded by the Polish Ministry of Science and Higher Education.
The work on Entity Linking was supported by the Polish National Centre for Research and Development – LIDER Program under Grant LIDER/27/0164/L-8/16/NCBR/2017, titled “Lemkin - intelligent legal information system”, and in part by PLGrid Infrastructure.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Kobyliński, Ł. et al. (2022). Evaluating Natural Language Processing tools for Polish during PolEval 2019. In: Vetulani, Z., Paroubek, P., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2019. Lecture Notes in Computer Science, vol 13212. Springer, Cham. https://doi.org/10.1007/978-3-031-05328-3_20
DOI: https://doi.org/10.1007/978-3-031-05328-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05327-6
Online ISBN: 978-3-031-05328-3
eBook Packages: Computer Science (R0)