+ Transkribus – The Best Idea to Procrastinate I’ve Ever Had
Stefan Karcher, a graduate student at Heidelberg University, has written a fascinating blog post explaining how he has been using Transkribus to process nineteenth-century German sermons.
Karcher took the opportunity to train his own Automated Text Recognition models. He used around 30,000 transcribed words of training data to generate a model that can produce transcripts of his documents with a Character Error Rate of 8-10%. The blog post notes that these transcripts are a useful and efficient basis for his research and includes a description of how these automated transcripts can be analysed with Voyant Tools.
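For readers unfamiliar with the metric, a Character Error Rate is commonly defined as the number of character-level edits (insertions, deletions, substitutions) needed to turn the automated transcript into the correct one, divided by the length of the correct transcript. The sketch below illustrates that common definition only; the post does not describe how Transkribus computes its figure, and the function names and the sample line are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting i characters of a yields the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcript."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: a hand-made reference line and a made-up recognition result.
reference = "Gnade sei mit euch"
hypothesis = "Gnabe sei mit cuch"
print(f"CER: {character_error_rate(reference, hypothesis):.1%}")
```

Under this definition, a rate of 8-10% means roughly one character in every ten to twelve would need correcting.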
Do you want to train your own Automated Text Recognition model?
- Find out how to get started in our How-to Guide.