OU Portal
Publication activity
Record type:
Paper in proceedings (D)
Home Department:
Department of Czech Language (25300)
Title:
Syntactic units and their length distributions: A case study in Czech
Citation
Nogolová, M., Koščová, M., Mačutek, J. and Čech, R. Syntactic units and their length distributions: A case study in Czech.
In:
Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025): Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025) 2025-08-29 Ljubljana.
Kerrville: Association for Computational Linguistics, 2025. pp. 115-123. ISBN 979-8-89176-293-0.
Subtitle
Publication year:
2025
Field:
Number of pages:
9
Page from:
115
Page to:
123
Form of publication:
Electronic version
ISBN code:
979-8-89176-293-0
ISSN code:
Proceedings title:
Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025)
Proceedings:
International
Publisher name:
Association for Computational Linguistics
Place of publishing:
Kerrville
Country of Publication:
Proceedings published abroad
Conference name:
Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025)
Conference venue:
Ljubljana
Conference start date:
Event type by participants' nationality:
Worldwide event
WoS code:
EID:
Key words in English:
Syntactic units; length distributions; dependency syntax
Annotation (original language: English):
This study investigates the length distributions of syntactic units in Czech across multiple hierarchical levels: sentences, independent clauses, clauses, phrases, subphrases, and chunks. Using a diverse dataset – including Universal Dependencies treebanks, presidential speeches, the Czech Bible, and a random sample from corpora of modern Czech – the analysis examines whether the lengths of these syntactic units follow consistent distributional patterns. Length is defined as the number of immediate subunits, and the distributions were modeled using the hyper-Poisson distribution. The results demonstrate that the hyper-Poisson model fits the length distributions of all the above-mentioned syntactic units well, pointing to a common principle underlying the organization of syntactic structure in Czech.