Publication activity

Record type:	proceedings paper (D)
Home Department:	Department of Czech Language (Katedra českého jazyka, 25300)
Title:	Syntactic units and their length distributions: A case study in Czech
Citation:	Nogolová, M., Koščová, M., Mačutek, J. and Čech, R. Syntactic units and their length distributions: A case study in Czech. In: Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025): Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025) 2025-08-29 Ljubljana. Kerrville: Association for Computational Linguistics, 2025. pp. 115-123. ISBN 979-8-89176-293-0.

Subtitle:
Publication year:	2025
Field:
Number of pages:	9
Page from:	115
Page to:	123
Form of publication:	Electronic version
ISBN code:	979-8-89176-293-0
ISSN code:
Proceedings title:	Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025)
Proceedings:	International
Publisher name:	Association for Computational Linguistics
Place of publishing:	Kerrville
Country of Publication:	Proceedings published abroad
Conference name:	Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025)
Conference venue:	Ljubljana
Conference start date:
Event type by participants' nationality:	Worldwide event
WoS code:
EID:

Key words in English:	Syntactic units; length distributions; dependency syntax
Annotation in original language:	This study investigates the length distributions of syntactic units in Czech across multiple hierarchical levels: sentences, independent clauses, clauses, phrases, subphrases, and chunks. Using a diverse dataset (Universal Dependencies treebanks, presidential speeches, the Czech Bible, and a random sample from corpora of modern Czech), the analysis examines whether the lengths of these syntactic units follow consistent distributional patterns. Length is defined as the number of immediate subunits, and the distributions were modeled using the hyper-Poisson distribution. The results demonstrate that the hyper-Poisson model fits the length distributions of all the above-mentioned syntactic units well, pointing to a common principle underlying the organization of syntactic structure in Czech.
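
As an illustration of the model named in the annotation, the following minimal Python sketch evaluates the hyper-Poisson probability mass function in its standard two-parameter form, P(X = x) = a^x / ((b)_x * 1F1(1; b; a)) for x = 0, 1, 2, ..., where (b)_x = b(b+1)...(b+x-1) is the rising factorial. The function names, the truncation of the normalizing series, and the shift by one (so that the shortest unit has length 1) are assumptions made for this sketch; the record does not state the exact parametrization or the fitting procedure used in the paper.

```python
def hyper_poisson_pmf(x, a, b, terms=200):
    """P(X = x) for the hyper-Poisson distribution on x = 0, 1, 2, ...

    a > 0 and b > 0 are the two parameters. The normalizing constant
    1F1(1; b; a) = sum_k a^k / (b)_k is approximated by a truncated series
    (the truncation length `terms` is an assumption of this sketch).
    """
    if x < 0:
        return 0.0
    norm = 0.0      # running sum of the series, approximates 1F1(1; b; a)
    term = 1.0      # current term a^k / (b)_k, starting at k = 0
    target = 0.0    # the term belonging to the requested x
    for k in range(max(terms, x + 1)):
        if k == x:
            target = term
        norm += term
        term *= a / (b + k)   # a^(k+1)/(b)_(k+1) = a^k/(b)_k * a/(b+k)
    return target / norm


def unit_length_pmf(length, a, b):
    """Hypothetical 1-displaced variant: lengths start at 1, not 0."""
    return hyper_poisson_pmf(length - 1, a, b)


if __name__ == "__main__":
    # Parameter values are arbitrary and for illustration only.
    for n in range(1, 6):
        print(n, round(unit_length_pmf(n, a=2.0, b=3.5), 4))
```

With b = 1 the rising factorial reduces to x!, so the sketch falls back to the ordinary Poisson distribution, which gives a quick sanity check on the implementation.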

References

Reference

R01:	RIV/61988987:17250/25:A2603D5A
