Paper

Authors: Petr Hyner (1, 2); Petr Marek (3); David Adamczyk (2, 1); Jan Hůla (3, 2) and Jan Šedivý (3)

Affiliations: 1 Department of Informatics and Computers, Faculty of Science, University of Ostrava, Ostrava, Czech Republic ; 2 Institute for Research and Applications of Fuzzy Modeling, University of Ostrava, Ostrava, Czech Republic ; 3 Czech Technical University in Prague, Prague, Czech Republic

Keyword(s): Language Models, Neural Networks, Transfer Learning, Vocabulary Swap.

Abstract: We present a simple approach for efficiently adapting pre-trained English language models to generate text in a lower-resource language, specifically Czech. We propose a vocabulary swap method that leverages parallel corpora to map tokens between languages, allowing the model to retain much of its learned capabilities. Experiments conducted on a Czech translation of the TinyStories dataset demonstrate that our approach significantly outperforms baseline methods, especially when using small amounts of training data. With only 10% of the data, our method achieves a perplexity of 17.89, compared to 34.19 for the next best baseline. We aim to contribute to work on cross-lingual transfer in natural language processing by proposing a simple-to-implement, computationally efficient method tested in a controlled environment.
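
The abstract does not spell out how the token mapping is built or applied. As one possible reading, the minimal sketch below maps each target-language token to the source token it co-occurs with most often in a toy parallel corpus, then copies the corresponding source embedding row into the new vocabulary. The function names `build_token_map` and `swap_embeddings`, the co-occurrence heuristic, the random fallback initialization, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict
import random


def build_token_map(parallel_pairs):
    """Map each target token to the source token it co-occurs with most.

    parallel_pairs: iterable of (source_tokens, target_tokens) sentence pairs.
    Hypothetical heuristic: raw sentence-level co-occurrence counts,
    not a real word aligner.
    """
    cooc = defaultdict(Counter)
    for src_tokens, tgt_tokens in parallel_pairs:
        for tgt in set(tgt_tokens):
            cooc[tgt].update(set(src_tokens))
    # Highest count wins; ties are broken alphabetically for determinism.
    return {
        tgt: min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for tgt, counts in cooc.items()
    }


def swap_embeddings(src_vocab, src_embeddings, tgt_vocab, token_map, dim, seed=0):
    """Build a target embedding table from a source one.

    src_vocab: dict mapping source token -> row index in src_embeddings.
    tgt_vocab: list of target tokens, in the order of the new table.
    Mapped tokens copy the source row; unmapped tokens get small random
    values (an assumed fallback, not specified in the abstract).
    """
    rng = random.Random(seed)
    tgt_embeddings = []
    for tok in tgt_vocab:
        src_tok = token_map.get(tok)
        if src_tok in src_vocab:
            tgt_embeddings.append(list(src_embeddings[src_vocab[src_tok]]))
        else:
            tgt_embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return tgt_embeddings


# Toy English-Czech parallel data (illustrative only).
pairs = [
    (["dog"], ["pes"]),
    (["cat"], ["kočka"]),
    (["dog", "sleeps"], ["pes", "spí"]),
    (["cat", "sleeps"], ["kočka", "spí"]),
]
english_vocab = {"dog": 0, "cat": 1, "sleeps": 2}
english_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
czech_vocab = ["pes", "kočka", "spí", "les"]  # "les" has no mapping

token_map = build_token_map(pairs)
czech_embeddings = swap_embeddings(
    english_vocab, english_embeddings, czech_vocab, token_map, dim=2
)
```

In this sketch only the embedding table changes; the remaining model weights would be reused as-is and then fine-tuned on target-language text, which is the sense in which the model could retain much of what it learned in English.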

CC BY-NC-ND 4.0

Paper citation in several formats:
Hyner, P. ; Marek, P. ; Adamczyk, D. ; Hůla, J. and Šedivý, J. (2024). Stealing Brains: From English to Czech Language Model. In Proceedings of the 16th International Joint Conference on Computational Intelligence - NCTA; ISBN 978-989-758-721-4; ISSN 2184-3236, SciTePress, pages 606-612. DOI: 10.5220/0013064500003837

@conference{ncta24,
author={Petr Hyner and Petr Marek and David Adamczyk and Jan Hůla and Jan Šedivý},
title={Stealing Brains: From English to Czech Language Model},
booktitle={Proceedings of the 16th International Joint Conference on Computational Intelligence - NCTA},
year={2024},
pages={606-612},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013064500003837},
isbn={978-989-758-721-4},
issn={2184-3236},
}

TY - CONF
JO - Proceedings of the 16th International Joint Conference on Computational Intelligence - NCTA
TI - Stealing Brains: From English to Czech Language Model
SN - 978-989-758-721-4
IS - 2184-3236
AU - Hyner, P.
AU - Marek, P.
AU - Adamczyk, D.
AU - Hůla, J.
AU - Šedivý, J.
PY - 2024
SP - 606
EP - 612
DO - 10.5220/0013064500003837
PB - SciTePress
