Milanka Matić-Chalkitis (MultiHTR project) · PyLaia · Published June 27, 2024

Gabelsberger_natural

Text Recognition

Description

The model is based on various manuscripts written in Gabelsberger shorthand. The training data includes part of the diaries and supplementary sheets of Michael Cardinal von Faulhaber (https://www.faulhaber-edition.de/index.html), part of the minutes of the Council of Ministers from 1900 (mrp, oeaw.ac.at), some war diaries of Carl Schmitt (Arbeitsgruppe Carl Schmitt / Carl Schmitt Tagebücher on GitLab), and some ego-documents from the private estates of various individuals.

The model was trained by Milanka Matić-Chalkitis as part of the MultiHTR project (project leader: Prof. Dr. Achim Rabus) at the Department of Slavic Languages and Literatures of the University of Freiburg (Germany). We would like to thank Dr. Philipp Gahn and Dr. Michael Pilarski (Institute of Contemporary History, Munich-Berlin), Dr. Stephan Kurz (Austrian Academy of Sciences), and Prof. Dr. Florian Meinl (University of Göttingen) and his team for kindly providing the ground-truth data and for their close cooperation.

The model is intended to assist those who have little or no expertise in Gabelsberger shorthand but who wish to explore the content of their documents themselves. Note that, although the model is based on a variety of training data, the quality of automatic transcription varies between individual manuscripts. We recommend comparing the model's transcription results with and without the language model.

Moderate error rate: 13.38% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 13.38% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 429,663
Lines: 30,691
Training Pages: 1,281
Model ID: 119053