nicoleta.hegedus · PyLaia · Published October 26, 2025

Transylvanian Parish Registers Hungarian 19th Century

Text Recognition

Description

This handwritten text recognition model is trained on Calvinist, Unitarian, and Lutheran parish registers from Transylvania, dating from the second half of the 19th century. These registers follow a specific tabular format and record the main life events of community members. The training data covers several writing styles, and the model performs relatively well on sources of this type, which contain many recurring words. It was developed within the PCE grant “From Parish Registers to Digital Infrastructures”, financed by UEFISCDI (grant no. 40/03.01.2025), for the period January 2025 – December 2027. The model is intended to be improved periodically.

Very low error rate: 2.04% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 2.04% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
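
To make the metric concrete, here is a minimal sketch of how a Character Error Rate is commonly computed: the Levenshtein (edit) distance between the reference transcription and the recognised text, divided by the number of characters in the reference. This is an illustration only. The function names and the sample strings are hypothetical, and the platform's own evaluation may normalise text or aggregate lines differently.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character missing from hyp
                curr[j - 1] + 1,     # extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate as a fraction of the reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: two accents lost in a 12-character name.
    reference = "Kovács János"
    recognised = "Kovacs Janos"
    print(f"{cer(reference, recognised):.2%}")  # prints 16.67%
```

At the reported 2.04%, this definition corresponds to roughly two wrong characters per hundred characters of reference text on the validation set.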

Words: 118,246
Lines: 66,336
Training Pages: 502
Model ID: 422781
Languages
Hungarian
Centuries
19th c.