+ Transkribus – The Best Idea to Procrastinate I’ve Ever Had

Stefan Karcher, a graduate student at Heidelberg University, has written a fascinating blog post explaining how he has been using Transkribus to process nineteenth-century German sermons.

Karcher took the opportunity to train his own Automated Text Recognition model. Using around 30,000 transcribed words as training data, he generated a model that produces transcripts of his documents with a Character Error Rate of 8-10%. The blog post notes that these transcripts are a useful and efficient basis for his research, and it describes how the automated transcripts can be analysed with Voyant Tools.
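To give a feel for what a Character Error Rate of 8-10% means, here is a minimal sketch in Python. It assumes the common definition of the metric (character-level edit distance between a reference transcript and the automated transcript, divided by the length of the reference); the exact procedure Transkribus uses is not described in this post, and the function names below are illustrative only.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn `ref` into `hyp`."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a character of ref
                curr[j - 1] + 1,     # insert a character of hyp
                prev[j - 1] + cost,  # substitute, or keep a match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Assumed definition: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# One wrong character in a ten-character reference gives a CER of 10%.
cer = character_error_rate("abcdefghij", "abcdefghiX")
print(f"{cer:.0%}")
```

Under this definition, a rate of 8-10% means that roughly one character in every ten to twelve needs correcting, which is consistent with the post's point that the transcripts are a useful starting point rather than a finished edition.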

Do you want to train your own Automated Text Recognition model?

Related Articles

+ Preserving our cultural heritage with a smartphone

The READ project is a big proponent of digitisation on demand using smartphones. A typical mobile phone camera can capture relatively high-quality images of historical documents, which can then be...

+ Searching the Spanish Golden Age with Keyword Spotting

Sixteenth- and seventeenth-century Spain saw a surge of thousands of theatrical productions. This period has become known as the Spanish Golden Age. Thanks to a new prototype web...

+ Recognising eighteenth-century legal records at Middle Temple

The Honourable Society of the Middle Temple is one of four Inns of Court: prestigious professional associations for barristers working in England. The archive and library of Middle Temple holds...