Publ3 search

Publication activity

Record type:	article in proceedings (D)
Home Department:	Ústav pro výzkum a aplikace fuzzy modelování (94410)
Title:	Stealing Brains: From English to Czech Language Model
Citation:	Hyner, P., Marek, P., Adamczyk, D., Hůla, J. and Šedivý, J. Stealing Brains: From English to Czech Language Model. In: IJCCI 2024: 16th International Joint Conference on Computational Intelligence: Proceedings of the 16th International Joint Conference on Computational Intelligence 2024-11-20 Porto. Porto: ScitePress, 2024. pp. 606-612. ISBN 978-989-758-721-4.

Subtitle
Publication year:	2024
Field:
Number of pages:	7
Page from:	606
Page to:	612
Form of publication:	Electronic version
ISBN code:	978-989-758-721-4
ISSN code:	2184-3236
Proceedings title:	Proceedings of the 16th International Joint Conference on Computational Intelligence
Proceedings:	International
Publisher name:	ScitePress
Place of publishing:	Porto
Country of Publication:	Proceedings published abroad
Conference name:	IJCCI 2024: 16th International Joint Conference on Computational Intelligence
Conference venue:	Porto
Conference start date:
Event type by participants' nationality:	Worldwide event
WoS code:
EID:

Key words in English:	Language Models, Neural Networks, Transfer Learning, Vocabulary Swap.
Annotation in original language:	We present a simple approach for efficiently adapting pre-trained English language models to generate text in a lower-resource language, specifically Czech. We propose a vocabulary swap method that leverages parallel corpora to map tokens between languages, allowing the model to retain much of its learned capabilities. Experiments conducted on a Czech translation of the TinyStories dataset demonstrate that our approach significantly outperforms baseline methods, especially when using small amounts of training data. With only 10% of the data, our method achieves a perplexity of 17.89, compared to 34.19 for the next best baseline. We aim to contribute to work on cross-lingual transfer in natural language processing, and we propose a simple-to-implement, computationally efficient method tested in a controlled environment.

References

Reference

R01:	RIV/61988987:17610/24:A25038L6
