Historisk Datalaboratorium Aalborg University · PyLaia · Published September 29, 2024

Seventeenth Century Danish Newspapers

Text Recognition

Description

Created by Louise Karoline Sort, Laura Marie Ahrensbach, Julie Edelsten, Alexander Simon Kjølby Carlsen, Andreas Winkler Bønnelykke, Adrian Ledaal Gundersen, Jonathan Eskerod Qvistorff Kanstrup, Sarah Lydia Blok Kloster, Mads Meldgaard Skibsted Kristensen, Magnus Østergaard Larsen, Hans Niklas Holmgaard Pedersen, Louise Emilie Pedersen, Lea Ruess, Maja Thorsø Rønn, Ditte Nørgaard Schrøder, Oliver Thomsen, Stinna Victoria Østergaard and Johan Heinsen.

The model was trained on 248 pages of the late seventeenth-century Danish newspaper Extraordinaire Maanedlige Relationer. It performs well on running text but tends to struggle with marginalia. For material with German mixed in, we recommend the NorFrak model.

Very low error rate: 0.58% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 0.58% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
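To make the metric concrete, here is a minimal sketch of how a CER figure is commonly computed: the character-level edit distance (insertions, deletions and substitutions) between the recognised text and a reference transcription, divided by the length of the reference. That this matches the exact calculation behind the figure reported here is an assumption; the function names `levenshtein` and `cer` and the sample strings are hypothetical and for illustration only.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # character missing from the hypothesis
                curr[j - 1] + 1,     # extra character in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a percentage of the reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(reference, hypothesis) / len(reference)


# Illustrative strings, not taken from the model's validation set:
# one wrong character in a ten-character word.
print(f"{cer('Relationer', 'Relatianer'):.2f}%")  # prints 10.00%
```

Running the same calculation over a small sample of your own pages, transcribed by hand as the reference, gives a rough idea of how the model performs on your material.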

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 86,063
Lines: 15,447
Training Pages: 248
Model ID: 186489
Languages: Danish