Amsterdam City Archive · Baselines · Published May 10, 2023

Notaries Baseline Model

Layout Analysis

Description

This baseline recognition model was developed for a project with the Amsterdam City Archives to make the notarial archive of the 16th and 17th centuries accessible. It was trained on a range of historical documents from the notarial archives, including notarial records, council protocols, and other manuscripts rich in marginalia and notes. It is recommended for detecting baselines in notarial manuscripts, council protocols, or documents containing marginalia and ancillary notes.
Very low loss: 4%

Loss indicates how far the predicted text regions deviate from the ground truth (lower is better). This model achieved a loss of 4% on its validation set. A loss below 10% generally indicates reliable baseline detection. Because it was trained on a broad range of page layouts, this model should generalise well, although complex or unusual structures may still require fine-tuning.

Layout detection quality depends heavily on your document's structure. Pages with columns, marginalia, or non-standard layouts may produce different results.

Words: 186,063
Lines: 36,378
Training Pages: 889
Model ID: 52118