achim.rabus · PyLaia · Published October 18, 2025
Generic Bulgarian Handwriting 1
Text Recognition
Description
This first version of a generic model for modern Bulgarian handwriting (predominantly from the mid-20th century to the 21st century) was trained at the Slavic Department in Freiburg im Breisgau (PI: Achim Rabus, researcher: Franz Beimesche). We used synthetic data (text formatted with handwriting fonts) to complement our handwritten Ground Truth data. Handwritten texts from the 1960s to the present were provided by several organizations and individuals, including the Bulgarian School Munich “Paisii Hilendarski”, the Bulgarian School in Berlin, the University of Sofia “St. Kliment Ohridski”, Sigrun Comati and Irina Guteva. A small number of texts written in an earlier version of Bulgarian orthography were also included. The content of the provided material varied widely: lecture notes, letters from former students to their professors and personal notes were used for training alongside texts in everyday language. The Bulgarian School in Berlin kindly provided its annual commemorative publication, which contains several handwritten texts by the school's students. Many thanks to all who supported the creation of this model!
Low error rate: 5.55% CER
Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 5.55% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
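To make the metric concrete, here is a minimal sketch of how a CER can be computed for a single line. It assumes the common definition (character-level edit distance divided by the length of the reference transcription). The page does not state how Transkribus normalises text or handles whitespace when it reports the figure, so the function names `levenshtein` and `cer` and the exact formula are illustrative assumptions, not the platform's implementation.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate under the assumed definition:
    edit distance / number of characters in the reference."""
    if not ref:
        raise ValueError("reference must not be empty")
    return levenshtein(ref, hyp) / len(ref)


# Hypothetical example: one wrong character in a five-character word.
print(cer("писмо", "писма"))  # 0.2, i.e. 20% CER
```

Under this definition, a CER of 5.55% corresponds to roughly one character-level error for every 18 characters of reference text.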
Words: 75,838
Lines: 8,348
Training Pages: 298
Model ID: 417809