Reading admiral de Ruyter's journal - using existing transcripts to train Automated Text Recognition

Nicoline van der Sijs is part of a team of researchers working at the Meertens Institute in the Netherlands (one of the READ MOU partners).  The team has trained an Automated Text Recognition model to process the handwriting of Michiel de Ruyter, a Dutch admiral from the seventeenth century.

The model was trained with around 20,000 words of existing transcribed material from de Ruyter’s journals (see below for an example of his tricky handwriting!).  These transcriptions were matched automatically to corresponding digitised images of de Ruyter’s handwriting using Text2Img matching technology developed by the CITlab team at the University of Rostock (one of the READ project partners).

The resulting model is capable of recognising De Ruyter’s handwriting with a Character Error Rate (CER) of around 10%, meaning that roughly one character in ten is transcribed incorrectly. This is a remarkable result for such a complex hand.
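For readers who would like to see how a figure like this is typically arrived at, here is a minimal sketch of a CER calculation. It assumes the common definition (character-level edit distance between a reference transcription and the recognised text, divided by the length of the reference); the exact evaluation procedure used for the De Ruyter model is not described here, and the function names and sample strings are made up for illustration.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of character
    substitutions, insertions and deletions needed to turn
    `hypothesis` into `reference`."""
    # previous[j] = distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example strings, not taken from the journals.
reference = "de vloot"   # what a human transcriber wrote
hypothesis = "de vlout"  # what the recognition model produced
print(f"CER: {character_error_rate(reference, hypothesis):.1%}")
```

In this made-up example a single wrong character in an eight-character reference gives a CER of 12.5%, close to the figure reported for the model.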

Image from the De Ruyter collection from the National Archives of the Netherlands, NL HaNA 1.10.72 20 0004

Professor van der Sijs and her colleagues are planning to use these transcriptions to compile an online corpus of de Ruyter’s writings for general access and scholarly linguistic analysis.

Researchers at the Meertens Institute are also interested in replicating these exciting results with other collections where existing transcriptions are already available, thanks to the hard work of volunteer transcribers.  The Stichting Vrijwilligersnet Nederlandse Taal (SVNT) is a network of about 100 volunteers who have been transcribing historic Bibles for more than ten years.  Other material transcribed by volunteers includes sailing letters from the seventeenth and eighteenth centuries and seventeenth-century printed newspapers.  The transcriptions that these volunteers have produced can be fed into our cutting-edge technology and used as training data for Automated Text Recognition.

  • Do you have existing transcriptions that you have produced or collected as part of a research project?
  • Send them to us and we can process them and train a model to recognise the writing in your documents!
  • To find out more about working with existing transcripts, consult our How to Guide or contact us.
