Projekt Trug&Schein · PyLaia · Published February 13, 2024

T&S_H1_M1

Text Recognition

Description

HTR model for the handwriting of the worker Hilde Nordhoff (pseudonym, born 1920) from rural Saxony, Germany. She wrote modern German in Latin script, including the German umlauts and the special character ß. Some characters retain traces of German Kurrent, such as an overline or curve above the u. German Giant I was used as the base model. Laura Fahnenbruck and Andrew S. Bergerson (University of Missouri-Kansas City) trained the model on letters from 1942 and 1943. The training data was transcribed as ground truth by a group of volunteers, over countless hours, in the public history project Trug&Schein: Ein Briefwechsel. Eine kritische Begegnung mit dem Alltag des Zweiten Weltkriegs – Schreib mit! (2011-2022). The large corpus of Nordhoff's letters to her husband (and his to her) spans the years 1938 to 1946 and is published as transcripts at https://alltag-im-krieg.de/startseite.

Low error rate: 5.1% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 5.1% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
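To make the CER figure concrete, here is a minimal sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between the reference transcription and the recognised text, divided by the length of the reference. The function names `levenshtein` and `cer` are illustrative, and the sample strings are invented. The page does not say what normalisation (whitespace, punctuation, Unicode form) is applied before scoring, so this sketch applies none and should not be read as the platform's exact procedure.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length.

    Assumption: no text normalisation is applied before comparison.
    """
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Invented example: the recogniser drops the umlaut and expands ß.
    reference = "Gr\u00fc\u00dfe aus Sachsen"
    hypothesis = "Grusse aus Sachsen"
    print(f"CER = {cer(reference, hypothesis):.1%}")
```

For a whole validation set, a common convention is to sum the edit distances over all lines and divide by the total number of reference characters, rather than averaging per-line rates, so that short lines do not dominate the result.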

Words: 135,584
Lines: 13,729
Training Pages: 862
Model ID: 59628
Languages: German
Centuries: 20th c.