aneta.yotova · PyLaia · Published July 27, 2025

AUSBUL_model

Text Recognition

Description

The training data set consists of Cyrillic handwritten manuscripts: CIAI 503 (Kostenec Damaskin), second half of the 17th century, ff. 255r.-258r.; NBKM 1418 (Elenski Damaskin), 17th century, ff. 312r.-335v.; NBKM 433 (Panagyurište Miscellany), 16th century, ff. 125r.-135v.; NBIV 105 (Plovdiv Miscellany), 15th century, ff. 10r.-19r.; NBKM 326 (Adžar Miscellany), 17th century, ff. 89r.-95r.; Slav. Fol. 36 (Berlin Damaskin), 18th century, ff. 327r.-339r.

Low error rate: 8.31% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 8.31% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

This figure was measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
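
To make the metric concrete, here is a minimal sketch of how a CER is commonly computed: the character-level edit distance (insertions, deletions, substitutions) between the recognised text and the reference transcription, divided by the reference length. This is an assumption about the usual definition, not a description of how Transkribus computes its figure; details such as whitespace or Unicode normalisation are not stated on this page. The function names `levenshtein` and `cer` are illustrative only.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate in percent, relative to the reference length.
    Assumes no text normalisation beyond what the caller has applied."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(ref, hyp) / len(ref)
```

For example, `cer("abcd", "abxd")` returns `25.0`: one substituted character out of four in the reference. Under this definition a CER of 8.31% corresponds to roughly 8 character edits per 100 reference characters.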

Words: 20,076
Lines: 2,329
Training Pages: 93
Model ID: 378005
Languages: Church Slavic
Centuries: 14th c., 15th c., 16th c., 17th c., 18th c.