Viennese Property Registers 1420-1517

Description

The model is based on the property registers of the city of Vienna from the 15th and early 16th centuries, which form part of the city books. All real estate transactions in which a property changed hands, for example through purchase or inheritance, were recorded in them. The entries follow a formula that varies only slightly, so the vocabulary represented in the training material is limited. The entries were written in Early New High German with a few Latin phrases. The scripts used are late Gothic minuscule, Bastarda and a very early Kurrent.

The training material consists of 1,228,264 words, which corresponds to approximately 3500 pages. The Ground Truth was created as part of the DFG-funded research project Mapping Medieval Vienna, which focuses on analyzing the content of the sources. The transcription guidelines are therefore aimed at readability: abbreviations have been resolved and medieval punctuation has been omitted. Letters are always transcribed in their basic form, diacritics have not been taken into account, and no distinction has been made between long and round "s". The following abbreviations were used for currency units: tl. = pound, s. = shilling, d. = pfennig, fl. = florin.

Due to the homogeneity of the source corpus, the model achieves a CER of 1.50% on a validation set.

Contact: j.helmchen@fu-berlin.de
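The transcription conventions described above (basic letter forms, no diacritics, no long "s") mean that the model's output will not match a diplomatic transcription character for character. The following is a minimal sketch of how an existing transcription could be brought closer to these conventions before comparing it with the model's output. The function name `normalize_transcription` is hypothetical, and the rules are only an approximation of the project's guidelines: the sketch does not resolve abbreviations or remove punctuation.

```python
import unicodedata


def normalize_transcription(text: str) -> str:
    """Approximate the simplified conventions: base letters only, round s.

    Assumption: this mirrors only two of the rules named in the description
    (no diacritics, no long s). It is not the project's own tooling.
    """
    # Split precomposed characters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark (umlaut dots, rings, superscript letters).
    base_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Map long s (U+017F) to round s.
    base_only = base_only.replace("\u017f", "s")
    return unicodedata.normalize("NFC", base_only)


if __name__ == "__main__":
    # Invented sample words, not taken from the registers.
    print(normalize_transcription("hau\u017f"))
    print(normalize_transcription("f\u00fcr"))
```

Stripping diacritics this way loses information, so it is only suitable for comparison against output that follows the same simplified conventions.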

Very low error rate: 1.5% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 1.5% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
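To check how the model performs on your own material, you can compute a CER against a small hand-corrected sample. The sketch below uses a common definition, the character-level edit distance (insertions, deletions, substitutions) divided by the length of the reference transcription. Whether the platform computes its reported figure in exactly this way is an assumption, and the function names `levenshtein` and `cer` are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance relative to the reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Invented example: one wrong character in a four-character reference.
    print(f"{cer('abcd', 'abxd'):.2%}")
```

Because the divisor is the reference length, a hypothesis with many spurious insertions can yield a CER above 100%.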

Words: 1,228,264

Lines: 127,905

Training Pages: 3,300

Model ID: 57815

Related models

Text Titan II

The Text Titan I ter

The Text Titan I (Super Model)

German Genius (Super Model)