Tobias Hodel · PyLaia · Published December 21, 2022

StAZH_RRB_German_Kurrent_XIX

Text Recognition

Description

Model trained on the complete handwritten minutes of the Zurich executive (1803-1887), based on automated text-to-image processing. The set is licensed under CC-BY-SA and may be reused. The minutes can be accessed online (https://www.archives-quickaccess.ch/search/stazh/rrb), and the TEI-XML is available on Zenodo (https://doi.org/10.5281/zenodo.803239).

Very low error rate: 1.2% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 1.2% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
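
To make the metric concrete, here is a minimal sketch of how a CER is commonly computed: the character-level edit distance between the reference transcription and the recognised text, divided by the length of the reference. This is the usual textbook definition, not necessarily the exact procedure used for the score above. The page does not say how whitespace, punctuation, or normalisation are handled, so those details are assumptions here, and the function names `levenshtein` and `cer` are purely illustrative.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # character missing from hypothesis
                curr[j - 1] + 1,                  # extra character in hypothesis
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate in percent, relative to the reference length.
    Assumption: no text normalisation is applied before comparison."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one extra character in a 13-character word.
    print(f"{cer('Regierungsrat', 'Regierungsrath'):.2f}% CER")
```

Under this definition, a CER of 1.2% corresponds to roughly one character error in every 83 characters of reference text. Running the same calculation on a few manually transcribed pages of your own material is one way to check how well the validation figure carries over.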

Words: 26,026,908
Lines: 5,909,205
Training Pages: 159,062
Model ID: 48925
Languages: German