Transkribus HTR competes in an OCR test at the University of Zurich

Researchers at the University of Zurich compared two versions of the ABBYY FineReader OCR (Optical Character Recognition) software, FineReader XIX and FineReader Server 11, with Transkribus HTR (Handwritten Text Recognition) to find out which system recognises blackletter type in historical newspapers most accurately. For the test they used PDFs containing medium-resolution images of the German-language Neue Zürcher Zeitung.

Recognising blackletter type in historical newspapers can be particularly challenging: the characters are often hard to tell apart, the paper quality can be poor and, in many cases, small font sizes are used. Systems like ABBYY FineReader and Transkribus aim to tackle such problems. We are happy that the University of Zurich experiment shows Transkribus providing significantly better results than the commercial system ABBYY FineReader.

The article explains why HTR is so effective: only a modest amount of manual work is needed to create the ground truth, which makes it practical to apply HTR to such documents. Error rates in Transkribus are usually low, especially for printed text in newspapers. Moreover, the test shows that the model trained on the Neue Zürcher Zeitung also provided good results for other newspapers of the same period, such as the Bundesblatt and the Neue Zuger Zeitung. The good news is that the Neue Zürcher Zeitung model will be made public during 2019.

If you would like to take a closer look at the experiment, you can find the whole article here: https://dev.clariah.nl/files/dh2019/boa/0694.html
