Nazar Kotsur · PyLaia · Published February 24, 2024

Ukrainian Wikisource

Text Recognition

Description

This is a general-purpose model intended for proofreading books in various orthographies on Ukrainian Wikisource. It was trained on books and articles from the 19th to the 21st centuries. Most of the material was downloaded from Wikisource, but some was taken from other sources. The model should work well with the most popular orthographies of the Ukrainian language. The training and the book transcriptions were done by Nazar Kotsur, a student at Ivan Franko National University of Lviv.

Very low error rate: 0.9% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 0.9% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
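To make the metric concrete, here is a minimal sketch of how a CER can be computed for a single line. It assumes the common definition (character-level edit distance between the reference transcription and the recognised text, divided by the reference length); the exact procedure used to produce the figure on this page is not stated and may differ. The function names `levenshtein` and `cer` and the sample strings are illustrative only.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # character missing from the hypothesis
                curr[j - 1] + 1,     # extra character in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a fraction (multiply by 100 for percent).
    Assumed definition: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Illustrative strings only: one substituted character in a 7-character word.
example = cer("Україна", "Украіна")
print(f"{example:.1%}")
```

Under this definition, a CER of 0.9% corresponds to roughly 9 character errors per 1,000 characters of reference text.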

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 89,442
Lines: 12,155
Training Pages: 341
Model ID: 60074
Languages: Ukrainian
Centuries: 19th c., 20th c., 21st c.