| Annotation in original language: |
The contribution examines the cultural perception of the names of the ten largest Czech rivers (Labe, Vltava, Morava, Odra, Dyje, Ohře, Berounka, Svratka, Otava, and Sázava) from the perspective of corpus linguistics. It studies how these names cluster on the basis of their collocates, thereby constructing a symbolic profile of each river from its linguistic behaviour. To this end, the contribution introduces, for the first time in Czech linguistics, three types of post-hoc collocate grouping, which may be seen as extensions of standard quantitative collocation extraction. Specifically, the research uses (1) the selective approach, which takes into account only the shared collocates, and two variants of the holistic approach: (2) the binary procedure, which considers merely the presence or absence of a collocate, and (3) the insertional procedure, which accounts for non-shared collocates by assigning them a collocation association value of 0. The analysed collocates comprise units of primary morphology (parts of speech), secondary morphology (cases), lexicology (lemmas), and syntax (clause members). Since all parts of speech and all cases are expected to occur as collocates of every river name, only procedure (1) is applied to these two types of unit. Altogether, 8 collocation analyses are thus conducted (one each for parts of speech and cases, and three each for lemmas and clause members). The investigation is based on the SYN2020 corpus, a well-balanced sample of three stylistic spheres of contemporary Czech. The resulting dendrograms show clusters of river names that reflect their symbolic roles in language and culture, providing insights for the study of hydronyms, literary texts, and cultural discourse, and opening new avenues for advanced collocation analysis.
|
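The three grouping procedures named in the annotation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the association scores, the collocates, the function names, and the choice of Euclidean distance are all assumptions made for illustration, and the annotation does not say which distance or linkage method produced the dendrograms.

```python
from math import sqrt

# Hypothetical association scores per river name (invented values, not corpus data).
scores = {
    "Labe":   {"tok": 7.0, "voda": 6.0, "jez": 5.0},
    "Vltava": {"tok": 6.5, "voda": 6.2, "most": 5.5},
    "Morava": {"tok": 6.8, "voda": 4.0, "niva": 5.1},
}

def selective(scores):
    """Procedure (1): keep only collocates shared by all names, with their scores."""
    shared = sorted(set.intersection(*(set(c) for c in scores.values())))
    return shared, {name: [c[k] for k in shared] for name, c in scores.items()}

def binary(scores):
    """Procedure (2): all collocates, coded 1 if present for a name and 0 if absent."""
    every = sorted(set.union(*(set(c) for c in scores.values())))
    return every, {name: [1 if k in c else 0 for k in every] for name, c in scores.items()}

def insertional(scores):
    """Procedure (3): all collocates, with a score of 0 inserted for non-shared ones."""
    every = sorted(set.union(*(set(c) for c in scores.values())))
    return every, {name: [c.get(k, 0.0) for k in every] for name, c in scores.items()}

def euclidean(u, v):
    """Distance between two profile vectors (an assumed choice of metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

if __name__ == "__main__":
    for procedure in (selective, binary, insertional):
        features, vectors = procedure(scores)
        d = euclidean(vectors["Labe"], vectors["Vltava"])
        print(procedure.__name__, features, round(d, 3))
```

The resulting vectors could then be passed to any hierarchical clustering routine to obtain a dendrogram of river names. The selective variant discards information about collocates that are unique to a name, while the two holistic variants keep it, either as presence or absence or as a zero score.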