brangheorghe7 · PyLaia · Published June 25, 2024

19th-century Romanian Transitional Script - GT corrected

Text Recognition

Description

This model was trained on a dataset of 19th-century Romanian documents obtained from the Central University Libraries (BCU) of Timișoara, Iași, and Cluj-Napoca, Romania. The training dataset comprises 153 pages of Romanian text written in the Romanian Transitional Script (RTS). RTS combines Latin and Cyrillic characters and was used in the Romanian provinces during the 19th century to ease the transition from the Romanian Cyrillic script to the modern Latin script.

The images in the dataset date from 1833 to 1864 and reflect the linguistic and typographic variation of that period. The selected texts cover a range of genres, including poems, novels, dramas, stories, newspapers, and religious texts.

For more details about the project, visit our website: https://transitional-romanian-transliteration.azurewebsites.net/

The dataset is available to download from Kaggle: https://www.kaggle.com/datasets/mariuspenteliuc/rts-ocr

This work was supported by a grant from the Romanian Ministry of Research, Innovation and Digitization, CCCDI – UEFISCDI, project number PN-III-P2-2.1-PED-2021-0693, within PNCDI III.

Very low error rate: 1.14% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 1.14% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
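To check how the model performs on your own material, you can compute a CER against a hand-corrected transcription. The sketch below assumes the common definition of CER as the Levenshtein (edit) distance between the recognised text and the reference, divided by the reference length; the exact computation used to produce the figure on this page is not stated here, so treat this as an approximation. The function names `levenshtein` and `cer` are illustrative, not part of any Transkribus or PyLaia API.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # character missing from the hypothesis
                curr[j - 1] + 1,     # extra character in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a percentage of the reference length.

    Assumption: CER = edit distance / number of reference characters.
    """
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(reference, hypothesis) / len(reference)
```

For example, `cer("abcd", "abxd")` returns `25.0`, because one of the four reference characters was substituted. Note that under this definition the CER can exceed 100% when the recognised text contains many extra characters.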

Words: 31,221
Lines: 4,349
Training Pages: 153
Model ID: 117113
Languages: Moldavian
Centuries: 19th c.