Aarhus City Archives · PyLaia · Published September 11, 2024

19th century Danish gothic handwriting v.1.3

Text Recognition

Description

This model is the product of long-term experimentation. It was trained on Danish Gothic handwriting dating from the late 18th century up to the transition to modern handwriting in 1875. The training material consists primarily of parish council minutes from various city and local archives in the Retro Project; Johan Heinsen, Associate Professor of History at Aalborg University, also contributed a large amount of material. The model has an overall error rate of 4.7%, but because the training material is quite narrow, it can still be somewhat imprecise. As an alternative, you can try the ‘19th century Danish gothic handwriting v.1.1’ model, which has a higher error rate of 6.7% but is based on more training material and therefore works better in some cases.

Low error rate: 5.33% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 5.33% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
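
To make the CER figure concrete, here is a minimal sketch of one common way to compute it: the Levenshtein (edit) distance between the reference transcription and the recognised text, divided by the number of characters in the reference. This is an illustration under that assumption, not necessarily the exact procedure Transkribus uses. The function names and the sample line are made up for the example.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a fraction of the reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example line: two characters (ø, æ) are misread.
reference = "Mødet blev hævet"
hypothesis = "Modet blev hevet"
print(f"CER: {cer(reference, hypothesis):.1%}")  # CER: 12.5%
```

Running the same calculation on a few manually transcribed lines from your own documents gives a rough idea of how far your results are from the validation score reported here.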

Words: 557,599
Lines: 83,756
Training Pages: 1,153
Model ID: 172909
Languages: Danish