Publication activity
Record type:
paper in proceedings (D)
Home Department:
Institute for Research and Applications of Fuzzy Modeling (94410)
Title:
Acquiring custom OCR system with minimal manual annotation
Citation
Adamczyk, D., Hůla, J., MOJŽÍŠEK, D. and Čech, R. Acquiring custom OCR system with minimal manual annotation.
In:
3rd IEEE International Conference on Data Stream Mining and Processing, DSMP 2020: Proceedings of the 2020 IEEE 3rd International Conference on Data Stream Mining and Processing 2020-08-21 Kyiv, Ukraine.
IEEE, 2020. pp. 231-236. ISBN 978-172813214-3.
Subtitle
Publication year:
2020
Field:
Computer science
Number of pages:
6
Page from:
231
Page to:
236
Form of publication:
Electronic version
ISBN code:
978-172813214-3
ISSN code:
Proceedings title:
Proceedings of the 2020 IEEE 3rd International Conference on Data Stream Mining and Processing
Proceedings:
International
Publisher name:
IEEE
Place of publishing:
not specified
Country of Publication:
Conference name:
3rd IEEE International Conference on Data Stream Mining and Processing, DSMP 2020
Conference venue:
Kyiv, Ukraine
Conference start date:
Event type by participants' nationality:
Worldwide event
WoS code:
EID:
2-s2.0-85093687591
Key words in English:
Historical Texts; Neural Networks; OCR; Synthetic Data
Annotation in original language:
We describe the development of a custom OCR system designed specifically for linguistic analysis of texts printed during the early modern period. This analysis requires precise detection of individual graphemes, so we could not apply standard approaches that transcribe whole lines in an end-to-end fashion. We also describe our use of synthetically generated images, which allow us to avoid manual annotation of a large training set.
References
Reference
R01:
RIV/61988987:17610/20:A210268A