aleksej.tikhonov · PyLaia · Published August 8, 2024

Russian generic handwritten and typed 1

Text Recognition

Description

An extension of the Russian generic handwriting 2 model. Curated and trained by Aleksej Tikhonov (MultiHTR project, University of Freiburg) with data from the Foundation of the International Memorial Association (with the participation of Aren Vanyan and Nikita Lomakin), as well as data from the model Russian print of the 18 c. (V. Okorokov’s Printing House) by Kira Kovalenko (European University at St. Petersburg). The model can transcribe Russian manuscripts and typewritten texts from the 18th to the 20th century.

Low error rate: 5.54% CER

Character Error Rate (CER) measures the percentage of characters recognised incorrectly. Lower is better. This model scored 5.54% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. The model was trained on a large and diverse body of material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
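
To make the CER figure concrete, here is a minimal sketch of how a character error rate can be computed for a single line. It assumes the common definition (Levenshtein edit distance between the reference transcription and the recognised text, divided by the reference length); the exact procedure used to obtain the 5.54% above is not stated here, and the function names `levenshtein` and `cer` are illustrative only.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r
                curr[j - 1] + 1,     # insert h
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate in percent, relative to the reference length
    (assumed definition, see the note above)."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(ref, hyp) / len(ref)


# One wrong character in an eight-character word.
print(cer("рукопись", "рукопнсь"))  # 12.5
```

Under this definition, a CER of 5.54% corresponds to roughly five or six character edits per 100 reference characters.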

Words: 622,132
Lines: 175,348
Training Pages: 5,271
Model ID: 148545
Languages: Russian