OU Portal
Publication activity
Record type: article in proceedings (D)
Home department: Katedra českého jazyka (25300)
Title: The SIGMORPHON 2022 Shared Task on Morpheme Segmentation
Citation:
Batsuren, K., Bella, G., Arora, A., Martinovic, V., Gorman, K., Žabokrtský, Z., Ganbold, A., Dohnalová, Š., Ševčíková, M., Pelegrinová, K., Giunchiglia, F., Cotterell, R. and Vylomova, E. The SIGMORPHON 2022 Shared Task on Morpheme Segmentation. In: Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology 2022-07-14 Seattle, Washington. Seattle, Washington: Association for Computational Linguistics, 2022. pp. 103-116. ISBN 978-1-955917-82-7.
Subtitle:
Year: 2022
Field:
Number of pages: 14
First page: 103
Last page: 116
Form of publication: Electronic version
ISBN: 978-1-955917-82-7
ISSN:
Proceedings title: Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Proceedings: International
Publisher: Association for Computational Linguistics
Place of publication: Seattle, Washington
Country of publication: Proceedings published abroad
Conference name:
Conference venue: Seattle, Washington
Conference start date:
Event type by nationality of participants: Worldwide event
UT WoS code:
EID: 2-s2.0-85139172965
Keywords in English: morpheme segmentation, tokenization, Czech, English, Spanish, Hungarian, French, Italian, Russian, Latin, Mongolian
Description in the original language:
The SIGMORPHON 2022 shared task on morpheme segmentation challenged systems to decompose a word into a sequence of morphemes and covered most types of morphology: compounds, derivations, and inflections. Subtask 1, word-level morpheme segmentation, covered 5 million words in 9 languages (Czech, English, Spanish, Hungarian, French, Italian, Russian, Latin, Mongolian) and received 13 system submissions from 7 teams; the best system averaged 97.29% F1 score across all languages, ranging from English (93.84%) to Latin (99.38%). Subtask 2, sentence-level morpheme segmentation, covered 18,735 sentences in 3 languages (Czech, English, Mongolian), received 10 system submissions from 3 teams, and the best systems outperformed all three state-of-the-art subword tokenization methods (BPE, ULM, Morfessor2) by 30.71% absolute. To facilitate error analysis and support any type of future studies, we released all system predictions, the evaluation script, and all gold standard datasets.
Description in English:
The SIGMORPHON 2022 shared task on morpheme segmentation challenged systems to decompose a word into a sequence of morphemes and covered most types of morphology: compounds, derivations, and inflections. Subtask 1, word-level morpheme segmentation, covered 5 million words in 9 languages (Czech, English, Spanish, Hungarian, French, Italian, Russian, Latin, Mongolian) and received 13 system submissions from 7 teams; the best system averaged 97.29% F1 score across all languages, ranging from English (93.84%) to Latin (99.38%). Subtask 2, sentence-level morpheme segmentation, covered 18,735 sentences in 3 languages (Czech, English, Mongolian), received 10 system submissions from 3 teams, and the best systems outperformed all three state-of-the-art subword tokenization methods (BPE, ULM, Morfessor2) by 30.71% absolute. To facilitate error analysis and support any type of future studies, we released all system predictions, the evaluation script, and all gold standard datasets.
Type of funding source for the result: Other public sources
List of projects:
Project ID
Project name
List of responses:
Response
R01: RIV/61988987:17250/22:A2302FXH