+ National Archives releases first version of a Dutch handwriting model

The digitisation team around Liesbeth Keyser at the National Archives of the Netherlands is working hard to create training data for its collections in preparation for large-scale HTR processing. As a first result, a model based on 475,769 words is now available to Transkribus users. The model shows a Character Error Rate (CER) of 7.48% on the training set and 6.15% on the validation set. It is based on the careful transcription of dozens of different hands and comprises scans of the Incoming Documents from the Dutch East India Company (Overgekomen Brieven en Papieren van de VOC) of the National Archives of the Netherlands and of 19th-century notarial deeds from the Noord-Hollands Archief. The model is named NAN/NHA_GT_M3+. Enjoy!

 

Related Articles

+ Printed vs. handwritten text lines - automatically separated

The Transkribus team collaborates with the Pattern Recognition team of the University of Erlangen-Nürnberg (also a member of READ-COOP SCE), and our colleagues were kind enough to make an interesting...

+ Paper on Transkribus and handwritten text recognition (HTR) in archives now open access

A general paper about Transkribus was published in the Journal of Documentation. "Transforming scholarship in the archives through handwritten text recognition" gives an overview of the current use of...

+ Digitisation blog of the University Archive Greifswald

Dr. Dirk Alvermann of the University Archive Greifswald is one of the pioneers of Transkribus. He started working with the first version of Transkribus back in 2015. Now he has received a grant from...