Sustainable efficiency: How the University of Georgia transcribed 20,000 pages in two months

Fiona Park

May 20, 2026·6 min read

The Finding Their Names: Discovery and Description of Enslavement Events project is a major initiative by the Hargrett Rare Book and Manuscript Library at the University of Georgia. Funded by a $137,000 grant from the National Historical Publications and Records Commission (NHPRC), the project aims to identify and document enslaved persons who lived during the Colonial and Antebellum periods of the United States by studying the content of historical documents most often created by the people who enslaved them.

The project team was tasked with analysing over 20,000 pages of archival documents sourced from over 80 collections, including correspondence, diaries, ledgers, and legal records. By creating high-resolution scans, full-text transcripts, and applicable datasets, and then publishing them online, the team was able to transform these materials from unscanned paper collections into searchable, machine-readable, and future-ready digital resources.


Will Stanier, Librarian and Project Coordinator, University of Georgia Libraries

Chris Lott, Digitization and Data Coordinator, UGA Special Collections Libraries

Key Facts

Organisation: Hargrett Rare Book and Manuscript Library at the University of Georgia (supported by a grant from the NHPRC).

Material: Over 20,000 pages of Colonial and Antebellum era archival documents.

Goal: To identify and document enslaved persons by creating searchable, machine-readable digital resources and publishing them to the Digital Library of Georgia (DLG) and Enslaved.org.

Project constraints: A small team of two people, a strict two-month timeline before Transkribus credits expired, and the need to meet specific technical standards/metadata requirements for external digital repositories.

Result: Successfully generated transcripts and metadata for 20,000+ pages in under two months, establishing a sustainable and efficient AI-assisted workflow for future projects.

The University of Georgia Libraries are home to thousands of documents relating to the slave trade. © University of Georgia Libraries

A challenging collection with technical requirements

The primary obstacle for the team was the nature of the collections. While some of the documents had already been digitised, their descriptions and metadata were inconsistent, making it difficult for researchers and library users to search across documents and piece together narratives from mentions in different sources. To ensure the widest possible impact, the team planned to publish the digitised documents on both the Digital Library of Georgia (DLG) and the Enslaved.org digital archive, but this came with its own set of requirements.

“Documents needed to comply with various standards. First, the professional standards of library and archives generally. More specifically, the metadata standards for details like file-names that are inherent to both our library and the Digital Library of Georgia. And then also to the standards suggested by the Controlled Vocabulary published by Enslaved.org.” - Will Stanier, Librarian and Project Coordinator, University of Georgia Libraries

Operational constraints added another layer of difficulty. The library already had a Transkribus subscription, but the credits were soon to expire, and the project was managed by a small team of two. They needed a methodology that could deliver results quickly without compromising on technical requirements or transcription accuracy.

The Digital Library of Georgia contains resources from across all the libraries at the University of Georgia. © University of Georgia Libraries

A hybrid approach of AI and custom code

The Hargrett Library team developed a hybrid approach of Transkribus plus custom code to achieve optimal efficiency with very few resources. The Transkribus interface allows for extensive formatting and tagging of documents, with an array of options for exporting any metadata created. This flexibility allowed the team to write a custom script that facilitated the bulk uploading, recognition, and downloading of pages.

The script automatically compiled the data into text files that mirrored the original physical objects and complied with existing DLG and Enslaved.org standards. This hybrid approach of AI and custom code made the work very efficient.
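The team's script has not been published in this article, so the following is only a minimal sketch of what the compilation step could look like, written in Python. It assumes that per-page transcripts have already been downloaded as plain text files, and that each file name ends in an underscore and a page number (for example `letter001_2.txt`). That naming pattern, the `[page N]` marker, and the function names are all hypothetical choices for illustration, not the DLG or Enslaved.org conventions. The upload, recognition, and download steps are left out because the article does not describe how they were implemented.

```python
from collections import defaultdict
from pathlib import Path
import re

# Hypothetical naming pattern: "<item_id>_<page_number>.txt".
# The real project file-name conventions are not described in the article.
PAGE_NAME = re.compile(r"^(?P<item>.+)_(?P<page>\d+)$")


def group_pages_by_item(page_files):
    """Group page files by item ID, ordered by numeric page number."""
    items = defaultdict(list)
    for path in page_files:
        path = Path(path)
        match = PAGE_NAME.match(path.stem)
        if match is None:
            continue  # ignore files that do not follow the assumed pattern
        items[match["item"]].append((int(match["page"]), path))
    # Sort numerically so that page 10 comes after page 2.
    return {item: sorted(pages) for item, pages in items.items()}


def compile_transcripts(input_dir, output_dir):
    """Write one combined text file per item and return the paths written."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    grouped = group_pages_by_item(sorted(input_dir.glob("*.txt")))
    written = []
    for item, pages in grouped.items():
        parts = [
            f"[page {number}]\n{path.read_text(encoding='utf-8').strip()}"
            for number, path in pages
        ]
        target = output_dir / f"{item}.txt"
        target.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Calling `compile_transcripts("pages", "items")` would produce one text file per item in the `items` folder, with the pages in their original order. Grouping at the item level mirrors the article's point that transcripts needed to be compiled for each item or folder rather than page by page.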

“[Before creating this scripting workflow], we were looking at manually uploading material at the level at which we would eventually want to compile the transcripts (for each folder of material or, more commonly, for each item), or manually compiling after recognition, either of which would have required significant time investments that can now be directed toward quality control and subsequent phases of metadata creation.” - Will Stanier

By combining Transkribus with custom code, the team of two were able to create a custom workflow that met their individual transcription requirements. © Transkribus

Conclusion

The Finding Their Names project demonstrates how special collections libraries can use AI to overcome traditional digitisation bottlenecks. By combining Transkribus with tailored automation, the Hargrett Rare Book and Manuscript Library at the University of Georgia has not only increased the speed of its digitisation efforts and laid the groundwork for future projects, but has also ensured that the stories of enslaved individuals are more accessible and searchable than ever before.

Digitising written sources with specific technical requirements is a challenge experienced by many archives and institutions. If your team wants to find out more about how Transkribus can be used to create custom digitisation workflows for your project, then reach out to one of our advisors today and discover how Transkribus could help you reach your digitisation goals.
