The Polish Schwabacher

Name: The Polish Schwabacher
Author: j.halaczkiewicz@uj.edu.pl

Description

This text recognition model was developed from scans of Jakub Śliwski's Polish translation of "Historia del regno di Voxu del Giapone, dell'antichita, nobilta, e valore del suo re Idate Masamune" by Scipione Amati, published by Franciszek Cezary in Cracow in 1616. The story tells of the second Japanese mission to Europe, which took place after the successful establishment of the Christian faith in Japan by the Franciscan missionary Luis Sotelo (1574–1624). Scans of the source book are available at the National Digital Library Polona: https://polona.pl/preview/45b00d44-4957-41c8-af0e-6e9ccae557ae.

The source material was printed mainly in the Polish Schwabacher, a Gothic font used by typesetters for texts in the national language (see more: https://typoteka.pl/en). It also contains italics (used to highlight quoted fragments) and a Latin font (for Latin words).

The model is intended to help in preparing a diplomatic transcription. All characters have been preserved, including ſ, á, v used as "u", y used as "i" or "j", and / used as a comma. Abbreviations and ligatures (such as sweg°, teg°, æ, &) were expanded. The model does not recognize initials, and it may not recognize headlines correctly.

Contributors: Dr Joanna Hałaczkiewicz (Faculty of Polish Studies, Jagiellonian University, j.halaczkiewicz@uj.edu.pl), editor and supervisor; Karolina Kapuścińska, Agata Lech, Gabriela Paszkowska, Aleksandra Sobańska, Agnieszka Tkacz and Olga Zatońska, master's students of Polish philology with an emphasis in textual scholarship.


Very low error rate: 0.87% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 0.87% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
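To make the CER figure concrete, here is a minimal sketch of how a character error rate can be computed for one line of text. It assumes the common definition (Levenshtein edit distance between the reference transcription and the model output, divided by the reference length); the page does not state the exact procedure Transkribus uses, and the function names and the sample line are hypothetical illustrations only.

```python
import unicodedata


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn hyp into ref (standard dynamic-programming algorithm)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction of the reference length.

    Both strings are normalised to NFC first (an assumption of this sketch),
    so that an accented letter such as 'á' counts as one character.
    """
    reference = unicodedata.normalize("NFC", reference)
    hypothesis = unicodedata.normalize("NFC", hypothesis)
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: a diplomatic reference line versus an output
# that lost the long s and the accent.
reference_line = "ſwego krolá"
model_output = "swego krola"
print(f"CER: {cer(reference_line, model_output):.2%}")
```

In a diplomatic transcription, characters such as ſ and á are preserved, so an output that silently modernises them counts as an error under this measure.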

Words: 16,407

Lines: 1,860

Training Pages: 56

Model ID: 101013

Related models

Transkribus Polish M2

The Text Titan I ter

The Text Titan I (Super Model)

The German Giant I