+ Blog post from The British Library - Handwritten Text Recognition of India Office Records

The British Library, one of the READ project’s Memorandum of Understanding partners, has been working with Transkribus to process records from the India Office. This collection relates largely to the London-based administration of the East India Company and the pre-1947 Government of India.

The British Library started to experiment with Transkribus technology back in 2015. The complex layout of some of the documents and the number of different hands mean that this collection represents a challenge for automated processing. But the latest results show that an Automated Text Recognition model can transcribe pages with a satisfactory Character Error Rate (CER) of 15%.
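For readers unfamiliar with the metric, CER is commonly defined as the edit (Levenshtein) distance between a reference transcription and the recognised text, divided by the length of the reference. The Python sketch below assumes that standard definition; the function names and the sample strings are illustrative only, and the exact calculation used inside Transkribus may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Made-up example: 3 wrong characters in a 20-character reference.
reference = "India Office Records"
hypothesis = "Indja Offjce Recards"
print(f"CER: {character_error_rate(reference, hypothesis):.0%}")
```

Under this definition, a CER of 15% means that roughly 15 in every 100 characters of the reference would need to be corrected.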

Alex Hailey, Curator of Modern Archives and Manuscripts, explains more about the progress and lessons learned in his blog post on The British Library’s Digital Scholarship blog.

Related Articles

+ Join us at the 2018 Scanathon in London, Zurich and Helsinki!

The READ project is organising an exciting international Scanathon on Friday 8 June 2018, with parallel events taking place in Finland, Switzerland and the United Kingdom. We invite you to come along...

+ English Cycling diaries recognised by University of Warwick

We’ve got some terrific results to report relating to an interesting collection of documents held at the Modern Records Centre at the University of Warwick. Archivist Elizabeth Wood and her team have...

+ Help us process tables in Transkribus!

Information laid out in tables often seems very neat to the human eye but computers can struggle to process the tables that appear commonly in historical documents. At READ, we are working hard to...