Coming soon - new DocScan app to help users digitise historical documents!

More and more archival holdings are being digitised, but thousands of document collections still exist only in manuscript form. Interested readers must therefore visit the archive in person to photograph and transcribe the documents they want to study.

The READ project is seeking to make this process easier with a new digitisation service. The Computer Vision Lab at Technical University Vienna is developing DocScan, an open-source Android app that allows archive users to take high-quality images of historical documents.

Screenshot of Transkribus DocScan

DocScan automatically detects the page area of a document and provides real-time feedback on the quality of the image according to factors like perspective, sharpness and light.  This allows users to take high-quality images that can be used for Handwritten Text Recognition in Transkribus, or simply for future research. The DocScan app will be connected to Transkribus so users can upload their images directly to our cloud.
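To give a rough idea of what such quality feedback could involve, here is a minimal Python sketch. It is not DocScan's actual implementation, which is not described in this post. It assumes a small greyscale image stored as a list of rows, uses the variance of a simple Laplacian filter as a stand-in for sharpness and the mean pixel value as a stand-in for light, and leaves out perspective entirely. The function names and thresholds are hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance

# Assumption: a greyscale image is a list of rows with values in 0..255.
Image = list[list[float]]


def laplacian_variance(img: Image) -> float:
    """Variance of a 4-neighbour Laplacian; low values suggest a blurry image.

    The image must be at least 3x3 so that it has interior pixels.
    """
    height, width = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
        - 4 * img[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def mean_brightness(img: Image) -> float:
    """Average pixel value, used here as a crude measure of lighting."""
    return mean(value for row in img for value in row)


def quality_feedback(
    img: Image,
    min_sharpness: float = 100.0,   # hypothetical threshold
    min_brightness: float = 60.0,   # hypothetical threshold
    max_brightness: float = 220.0,  # hypothetical threshold
) -> list[str]:
    """Return a list of detected problems; an empty list means none found."""
    problems = []
    if laplacian_variance(img) < min_sharpness:
        problems.append("blurry")
    brightness = mean_brightness(img)
    if brightness < min_brightness:
        problems.append("too dark")
    elif brightness > max_brightness:
        problems.append("too bright")
    return problems
```

For example, a uniformly grey 8x8 image has no edges at all, so `quality_feedback([[128.0] * 8 for _ in range(8)])` reports it as `["blurry"]`. A real app would run checks of this kind on each camera frame and would need thresholds tuned to the device and the material being photographed.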

The Computer Vision Lab is also working on a prototype of a ScanTent. This is a piece of equipment designed to hold a mobile phone in a stable position in order to produce a more standardised shot. This could be particularly handy for scanning bound volumes, where two hands are sometimes needed to keep the pages in place.

DocScan and the ScanTent can also be of use to archives, as they could enable institutions to build up a collection of user-generated content.  QR code recognition or similar technology could be employed to ensure that images are organised correctly within an archive’s digital collections.

If you are interested in finding out more, you can read our reports:

Günter Mühlberger (University of Innsbruck), Markus Diem, Stefan Fiel and Florian Kleber (all at the Computer Vision Lab, Technical University Vienna), D5.14 ScanREAD.

Günter Mühlberger (University of Innsbruck), Markus Diem, Fabian Hollaus, Stefan Fiel and Florian Kleber (all at the Computer Vision Lab, Technical University Vienna), D81. Open Innovation Forum.

You can also take a look at the back-end of the DocScan app on our GitHub page.

We will be partnering with several archives to test these two products, and we plan to organise a ‘scanathon’ to see how quickly users can produce good-quality digital images. Stay tuned to hear more about the development and testing of the app!
