Aarhus City Archives · PyLaia · Published September 27, 2020

Danish 1870-1950 v3.5 PyLaia

Text Recognition

Description

General model for Danish handwriting by various hands from c. 1870-1950. Trained on the same data as Danish 1870-1950 v3.5, but using the PyLaia engine instead of HTR+. Created by Kristian Pindstrup & Jan Mattias Jonsson Agger at Aarhus City Archives. The training data draws on available material from the Royal Danish Library as well as material produced by volunteers at Aarhus City Archives, Faxe Archive, Næstved Archive and Gentofte Archive, who have transcribed various city and parish council minutes. More information about the project (in Danish) is available at http://retrodigitalisering.dk

Very low error rate: 4.7% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 4.7% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
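
To make the CER figure concrete, here is a minimal sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between the recognised text and a reference transcription, divided by the number of characters in the reference. This is an assumption about the general metric, not a description of how Transkribus computes it internally; the function names `levenshtein`, `cer` and `corpus_cer` are hypothetical, and any text normalisation the platform may apply is not covered.

```python
from typing import Iterable, Tuple


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of one line: edit distance / reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(ref, hyp) / len(ref)


def corpus_cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """CER over many (reference, hypothesis) lines, weighted by length.

    Assumption: errors and reference characters are summed over all
    lines before dividing, rather than averaging per-line rates.
    """
    errors = 0
    chars = 0
    for ref, hyp in pairs:
        errors += levenshtein(ref, hyp)
        chars += len(ref)
    if chars == 0:
        raise ValueError("no reference characters to compare against")
    return errors / chars
```

Under this definition, a CER of 4.7% corresponds to roughly 47 character edits per 1,000 reference characters. Running `corpus_cer` on a small sample of your own manually corrected lines is one way to check how closely the published figure carries over to your material.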

Words: 1,603,622
Lines: 372,704
Training Pages: 8,010
Model ID: 26311
Languages: Danish
Centuries: 19th c., 20th c.