Skip to content
  • Pricing
Success StoryUniversitiesEnglishLibrary19th century18th century

Sustainable efficiency: How the University of Georgia transcribed 20,000 pages in two months

Sustainable efficiency: How the University of Georgia transcribed 20,000 pages in two months

The Finding Their Names: Discovery and Description of Enslavement Events project is a major initiative by the Hargrett Rare Book and Manuscript Library at the University of Georgia. Funded by a $137,000 grant from the National Historical Publications and Records Commission (NHPRC), the project aims to identify and document enslaved persons who lived during the Colonial and Antebellum periods of the United States by studying the content of historical documents most often created by the people who enslaved them.

The project team was tasked with analysing over 20,000 pages of archival documents sourced from over 80 collections, including correspondence, diaries, ledgers, and legal records. By creating high-resolution scans, full-text transcripts, and applicable datasets, and then publishing them online, the team was able to elevate these documents from collections of unscanned documents into searchable, machine-readable, and future-ready digital resources.

  

Match Page 2 to Page 1 (1)

Will Stanier, Librarian and Project Coordinator, University of Georgia Libraries 

Chris Lott, Digitization and Data Coordinator, UGA Special Collections Libraries

  

Key Facts

Organisation: Hargrett Rare Book and Manuscript Library at the University of Georgia (supported by a grant from the NHPRC).

Material: Over 20,000 pages of Colonial and Antebellum era archival documents.

Goal: To identify and document enslaved persons by creating searchable, machine-readable digital resources and publishing them to the Digital Library of Georgia (DLG) and Enslaved.org.

Project constraints: A small team of two people, a strict two-month timeline before Transkribus credits expired, and the need to meet specific technical standards/metadata requirements for external digital repositories.

Result: Successfully generated transcripts and metadata for 20,000+ pages in under two months, establishing a sustainable and efficient AI-assisted workflow for future projects.

IMG_0870The University of Georgia Libraries are home to thousands of documents relating to the slave trade. © University of Georgia Libraries

A challenging collection with technical requirements

The primary obstacle for the team was the nature of the collections. While some of the documents had already been digitised, they were not consistent in descriptions and metadata, making it difficult for researchers and library users to search across documents to understand narratives from mentions in different sources. To ensure the widest possible impact, the team planned to publish the digitised documents on both the Digital Library of Georgia (DLG) and the Enslaved.org digital archive, but this came with its own set of requirements.

“Documents needed to comply with various standards. First, the professional standards of library and archives generally. More specifically, the metadata standards for details like file-names that are inherent to both our library and the Digital Library of Georgia. And then also to the standards suggested by the Controlled Vocabulary published by Enslaved.org.” - Will Stanier,  Librarian and Project Coordinator, University of Georgia Libraries 

Operational constraints added another layer of difficulty. The library already had a Transkribus subscription, but the credits were soon to expire, and the project was managed by a small team of two. They required a methodology that could achieve results fast without compromising on technical requirements or transcription accuracy.

 

Screenshot 2026-05-19 115314
The Digital Library of Georgia contains resources from across all the libraries at the University of Georgia. © University of Georgia Libraries

  

A hybrid approach of AI and custom code

The Hargrett Library team developed a hybrid approach of Transkribus plus custom code to achieve optimal efficiency with very little resources. The Transkribus interface allows for extensive formatting and tagging of documents, with an array of options for exporting any metadata created. This flexibility allowed the team to write a custom script that facilitated the bulk uploading, recognition, and downloading of pages.

The script automatically compiled the data into text files that mirrored the original physical objects and complied with existing DLG and Enslaved.org standards. This hybrid approach of AI and custom code made the work very efficient.

“[Before creating this scripting workflow], we were looking at manually uploading material at the level at which we would eventually want to compile the transcripts (for each folder of material or, more commonly, for each item), or manually compiling after recognition, either of which would have required significant time investments that can now be directed toward quality control and subsequent phases of metadata creation.” - Will Stanier

Screenshot 2026-05-20 111158By combining Transkribus with custom code, the team of two were able to create a custom workflow that met their individual transcription requirements. © Transkribus 

 

More documents transcribed in less time

Increased productive capacity: Combining Transkribus and custom code allowed the team to generate full-text transcripts for over 20,000 pages of historical documents in less than two months. “The possibility of generating full-text transcripts for over 20,000 pages of 18th and 19th century handwritten documents in under two months, with two project staff, an AI tool, and a few lines of code presents exciting opportunities for special collections libraries.” - Chris Lott, Digitization and Data Coordinator, UGA Special Collections Libraries

Workflow efficiency: The workflow was so successful that the team were able to exceed their expectations by transcribing thousands of pages more than originally planned. “The workflow created the productive capacity to transcribe an additional 9,000 scans from non-grant digitization projects.” - Chris Lott

Enhanced discovery and research: The team was not only able to fulfil their digitisation goals on this project, but they also laid the foundations for future digitisation projects. “While we did not codify a specific approach to using Transkribus applicable to every future project, we did establish a sustainable model for recognizing documents quickly that will allow us much flexibility in how we approach these kinds of projects.” - Chris Lott

Conclusion

The Finding Their Names project demonstrates how special collections libraries can use AI to overcome traditional digitisation bottlenecks. By combining Transkribus with tailored automation, the Hargrett Rare Book and Manuscript Library at the University of Georgia has not only increased the speed of its digitisation efforts and set the groundwork for future projects, but has also ensured that the stories of enslaved individuals are more accessible and searchable than ever before.

Digitising written sources with specific technical requirements is a challenge experienced by many archives and institutions. If your team wants to find out more about how Transkribus can be used to create custom digitisation workflows for your project, then reach out to one of our advisors today and discover how Transkribus could help you reach your digitisation goals.

Related Articles

How the 'Material Culture of Wills' project transcribed 25,000 wills with Transkribus
Success StoryResearchUniversities+517th centuryArchivesEnglish18th century16th century

How the 'Material Culture of Wills' project transcribed 25,000 wills with Transkribus

What did people in early modern England think about their possessions? Which objects held the most significance in their lives? It is these questions and more that the Material Culture of Wills:...

Creating a digital scholarly edition of the Lovelace papers with Jessica Cook
Success StoryResearchEnglish+4USALibrary19th centuryUK

Creating a digital scholarly edition of the Lovelace papers with Jessica Cook

Mathematics, computing, music, poetry: Ada Lovelace was a person of many talents. The 19th-century mathematician — and daughter of infamous Romantic poet, Lord Byron — is widely regarded as one of...

Creating the Swedish Lion Ⅰ text recognition model
Success Story17th centuryArchives+5Text recognition models19th century18th centurySwedishSweden

Creating the Swedish Lion Ⅰ text recognition model

Simplicity, citizen engagement and AI-driven transcription were key factors that intrigued Olof Karsvall and the team at the Swedish National Archives when they discovered Transkribus.