Authors: Bumbu Tudor,
Burţeva Liudmila,
Cojocaru Svetlana,
Colesnicov Alexandru,
Malahov Ludmila
Keywords: historical fonts, OCR, PostOCR.
Abstract
Processing texts from distant historical periods, especially handwritten texts in languages with few computational resources, presents significant challenges. Modern methods can achieve a fairly good character recognition rate after laborious machine learning procedures, but the correctness of the resulting editable text remains an open problem. This paper presents an approach that helps automate PostOCR proofreading by displaying the digitized text in historical fonts similar to those of the original document.
Tudor Bumbu 1,2, Lyudmila Burtseva 1,3,
Svetlana Cojocaru 1,4,
Alexandru Colesnicov 1,5, Ludmila Malahov 1,6
1 Moldova State University, "V. Andrunachievici" Institute of Mathematics and
Computer Science, Chisinau, Republic of Moldova
2 ORCID: https://orcid.org/0000-0001-5311-4464
3 ORCID: https://orcid.org/0000-0002-9064-2538
4 ORCID: https://orcid.org/0009-0003-1025-5306
5 ORCID: https://orcid.org/0000-0002-4383-3753
6 ORCID: https://orcid.org/0000-0001-9846-0299
DOI
https://doi.org/10.56415/csjm.v33.12