Transkribus recognises early modern German correspondence

The Gender History research group at the University of Jena (Thuringia, Germany) have been experimenting with Transkribus as part of a digital edition project on the correspondence of the eighteenth-century regent, Erdmuthe Benigna von Reuß-Ebersdorf (1670-1732).

Early modern scripts are very challenging for Automated Text Recognition technology because letters tend to be closely intertwined, abbreviations are frequent and spelling is not standardised. As the example below suggests, Erdmuthe’s writing is not easy to follow! She had a unique writing style and often broke words into separate parts.

Sample page of a letter (Source: Landesarchiv Thüringen – Staatsarchiv Greiz, Paragiatsherrschaft Köstritz, From IV 15, fol. 56r. All rights reserved)

To train a model to recognise Erdmuthe’s writing, the Gender History research team used about 250 pages of existing transcripts that had been produced in the course of their work on the digital edition. They also used these same transcripts to create a dictionary of Erdmuthe’s vocabulary that can be integrated into the recognition process.
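For readers curious what deriving a dictionary from transcripts might look like, here is a minimal Python sketch. It is an illustration only: the function name `build_word_list` and the sample lines are made up, and the article does not describe the file format or the procedure the team actually used with Transkribus.

```python
import re
from collections import Counter


def build_word_list(transcripts):
    """Count how often each word form occurs across a set of transcripts.

    Hypothetical helper: case is preserved because capitalisation carries
    meaning in German, and no spelling normalisation is attempted.
    """
    counts = Counter()
    for text in transcripts:
        # \w+ matches runs of letters and digits, including ä, ö, ü and ß
        counts.update(re.findall(r"\w+", text))
    return counts


if __name__ == "__main__":
    # Invented sample lines, not taken from Erdmuthe's letters
    sample = ["der Herr der Gnade", "Herr und Gnade"]
    for word, frequency in build_word_list(sample).most_common():
        print(word, frequency)
```

A frequency list like this keeps every spelling variant found in the transcripts, which suits a writer whose spelling is not standardised.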

The resulting model produces automated transcripts of Erdmuthe’s writing with a Character Error Rate (CER) below 9%, which means that roughly fewer than 9 in every 100 characters need correcting. When a dictionary is included in the recognition process, the errors are reduced still further.
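CER is commonly defined as the edit distance between the automated transcript and a correct reference transcript, divided by the length of the reference. The Python sketch below shows that common definition. The function names are made up for illustration, and the exact calculation Transkribus performs is not described here, so treat this as an approximation of the idea, not the platform's implementation.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from hypothesis
                current[j - 1] + 1,      # extra character in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Under this definition, a CER below 9% on a 1,000-character page corresponds to fewer than 90 character edits to reach the reference transcript.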

Martin Prell from the project team has described this experiment in more detail in a report (in German). He covers the experience of preparing training data for text recognition and working directly with Transkribus. If you are thinking about using Transkribus for your own project, this very instructive paper could help!
