Danish Newspapers 1800-1900

Name: Danish Newspapers 1800-1900
Author: Johan Heinsen (Aalborg University) and Max Odsbjerg Pedersen (Royal Danish Library)

Description

This model is trained on historical newspapers from the Danish newspaper collection at the Royal Danish Library (Det Kongelige Bibliotek), spanning from 1800 to 1900. It is trained to handle highly variable newspaper formats, ranging from small A4 pages with two columns to large-format A2 broadsheets with up to six columns. To achieve optimal results and ensure the correct reading order, it is recommended to apply column-wise region sorting after segmentation. The model was trained by Johan Heinsen (Aalborg University) and Max Odsbjerg Pedersen (Royal Danish Library) with the assistance of Kamilla Matthiassen and Helle Nedergaard Thorup.
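The column-wise region sorting recommended above can be sketched as follows. This is a minimal illustration, not part of Transkribus or of the model itself: the `Region` class, its bounding-box fields, and the `min_overlap` threshold are all hypothetical names chosen for the example, and it assumes each detected region has already been reduced to an axis-aligned box in page pixels. Regions are grouped into columns by horizontal overlap, columns are ordered left to right, and regions inside each column are ordered top to bottom.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical text region: an axis-aligned bounding box in page pixels."""
    id: str
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def horizontal_overlap(a: Region, b: Region) -> float:
    """Return the shared horizontal extent as a fraction of the narrower region."""
    left = max(a.x, b.x)
    right = min(a.x + a.w, b.x + b.w)
    narrower = min(a.w, b.w)
    if narrower <= 0:
        return 0.0
    return max(0, right - left) / narrower


def sort_column_wise(regions: list[Region], min_overlap: float = 0.5) -> list[Region]:
    """Order regions column by column (left to right), top to bottom within a column.

    A region joins the first column whose leftmost member it overlaps
    horizontally by at least `min_overlap` (an assumed threshold);
    otherwise it starts a new column.
    """
    columns: list[list[Region]] = []
    for region in sorted(regions, key=lambda r: r.x):
        for column in columns:
            if horizontal_overlap(region, column[0]) >= min_overlap:
                column.append(region)
                break
        else:
            columns.append([region])

    # Columns left to right, then regions top to bottom inside each column.
    columns.sort(key=lambda col: min(r.x for r in col))
    ordered: list[Region] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda r: r.y))
    return ordered


# Made-up two-column page, given in scrambled order.
page = [
    Region("c2_top", 550, 90, 400, 300),
    Region("c1_bottom", 105, 450, 390, 300),
    Region("c2_bottom", 545, 440, 410, 300),
    Region("c1_top", 100, 100, 400, 300),
]
reading_order = [r.id for r in sort_column_wise(page)]
```

A sketch like this does not handle regions that span several columns, such as a full-width masthead or headline; those would need to be separated out before the column grouping is applied.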


High precision: 89.28% MaP

Mean Average Precision (MaP) measures how accurately the model detects field regions (higher is better). This model scored 89.28% on its validation set. MaP is harder to compare across models than CER, because the score depends heavily on how many distinct region types the model must distinguish. A model detecting a handful of simple fields will naturally score higher than one trained to recognise many fine-grained regions, even if both perform well in practice.

This score reflects performance on the model's own validation data. Your results will depend on how closely your documents match the training material and the complexity of the structures you need to detect.
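The remark that MaP depends on the number of region types can be illustrated with a small sketch. It assumes the common definition of mean average precision as the unweighted mean of per-class average precision (AP) values; how Transkribus computes its figure in detail is not stated here. The region type names and AP values below are invented purely for illustration and are not measurements of this model.

```python
def mean_average_precision(ap_per_class: dict[str, float]) -> float:
    """Unweighted mean of per-class average precision values."""
    if not ap_per_class:
        raise ValueError("at least one class is required")
    return sum(ap_per_class.values()) / len(ap_per_class)


# Invented AP values: a model with two easy region types ...
simple_model = {"paragraph": 0.95, "heading": 0.93}

# ... and one that must also separate two harder, finer-grained types.
fine_grained_model = {
    "paragraph": 0.95,
    "heading": 0.93,
    "advert": 0.80,
    "caption": 0.72,
}

simple_map = mean_average_precision(simple_model)
fine_grained_map = mean_average_precision(fine_grained_model)
```

In this made-up case both models detect paragraphs and headings equally well, yet the fine-grained model's overall score is lower because the harder classes pull the mean down.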

Words: 149,913

Lines: 40,423

Training Pages: 1,467

Model ID: 420801

Related models

Field-model, 1700-tallets supplikprotokoller, supplik og svar

20th Century Typewritten Letters - Diplomatics' Elements

Basic Book Fields II

Page Layout of printed books (around 1800)