Keyword Spotting: search handwritten documents with Transkribus!

Transkribus can automatically produce transcripts of historical material with very impressive results, where 90-95% of characters in a given transcript are correct.  Just take a look at the slides and videos from our recent Transkribus User Conference to see some of the best outputs generated by our users.

But the potential of Automated Text Recognition is even greater when it comes to keyword searching!  Transkribus now includes Keyword Spotting technology, a sophisticated form of keyword searching based on research by the CITlab team at the University of Rostock (one of the READ project partners).

Keyword Spotting results in Transkribus for the word ‘Annuities’

This form of Keyword Spotting is particularly useful because it can work even when there are errors in the results of Automated Text Recognition. The technology searches through the probability values assigned to characters and words during the text recognition process. Once a user enters a search query, the program searches through all possible permutations of each word on the page and returns a range of results, with some more likely to be correct than others. Users can then check the search results and decide which ones to follow up.
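To give a feel for why searching probability values tolerates recognition errors, here is a toy sketch in Python. It is not the CITlab or Transkribus implementation: it assumes, purely for illustration, that the recogniser gives each word a fixed number of character positions, each with a small table of candidate characters and probabilities. The names `keyword_score`, `best_reading`, `spot`, the `page` data and the `threshold` value are all made up for this example.

```python
from math import prod
from typing import Dict, List, Tuple

# Assumed (hypothetical) recogniser output: one dict per character position,
# mapping candidate characters to their probabilities.
Word = List[Dict[str, float]]


def keyword_score(keyword: str, word: Word) -> float:
    """Probability that `word` reads as `keyword`, multiplying position by position."""
    if len(keyword) != len(word):
        return 0.0  # simplification: only compare words of the same length
    return prod(pos.get(ch, 0.0) for ch, pos in zip(keyword, word))


def best_reading(word: Word) -> str:
    """The single most likely transcription, as a plain transcript would show it."""
    return "".join(max(pos, key=pos.get) for pos in word)


def spot(keyword: str, page: List[Word], threshold: float = 0.05) -> List[Tuple[float, int]]:
    """Return (score, word index) pairs above the threshold, most likely first."""
    hits = [(keyword_score(keyword, w), i) for i, w in enumerate(page)]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)


# Invented example data: three words on a page.
page: List[Word] = [
    [{"r": 0.9, "n": 0.1}, {"e": 0.8, "c": 0.2}, {"n": 0.9, "u": 0.1}, {"t": 1.0}],
    [{"r": 0.6, "v": 0.4}, {"c": 0.6, "e": 0.4}, {"n": 0.9, "u": 0.1}, {"t": 0.9, "l": 0.1}],
    [{"l": 0.9, "t": 0.1}, {"a": 1.0}, {"n": 1.0}, {"d": 1.0}],
]

if __name__ == "__main__":
    for score, index in spot("rent", page):
        print(f"word {index}: top reading {best_reading(page[index])!r}, score {score:.4f}")
```

In this invented data the second word's most likely reading is "rcnt", so a plain text search for "rent" would miss it. Because the search scores the underlying probabilities instead, the word is still returned, just with a lower score than the first word, and the user can decide whether it is worth checking.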

Keyword Spotting is an amazing technological advance, with the potential to open up huge historical collections that have never been transcribed.

To use Keyword Spotting in Transkribus, you need to have trained an Automated Text Recognition model to recognise the documents in your collection. You can find more information about working with Keyword Spotting in our How to Guide.
