Submit thousands of jobs. We handle the rest.

The Transkribus API manages your processing queue intelligently. Submit documents one at a time or thousands in parallel: jobs are distributed across GPU clusters and processed asynchronously, and results are delivered via long polling or standard polling. The same workflow scales from a prototype integration to millions of archival pages.

200M+ pages processed on the platform

15M+ pages in a single project

300+ AI models for any script

Traditional pipeline vs. Transkribus

Document processing at scale used to mean managing people and queues manually. Transkribus handles that infrastructure for you.

Traditional approach

Hire transcribers

Recruit, train, and manage a team of skilled readers

Process sequentially

Each page transcribed by hand, one at a time

Quality review

Second reader checks every page for errors

Format and export

Manual conversion to the required output format

Linear — scales with headcount

Transkribus batch processing

Submit jobs

Upload via web app or submit thousands of jobs via API

Intelligent queue

Jobs are distributed across GPU clusters automatically

Get results

Long polling for results as soon as they're ready, or asynchronous polling for batch jobs

Export

Plain text, PAGE XML, ALTO, TEI — structured output

Parallel — scales with infrastructure

Intelligent queue management

How the processing pipeline works

The Transkribus API is async by design. Submit jobs at any rate — the queue distributes them across available GPU capacity. For real-time integrations, use long polling to get results as soon as they're ready. Not satisfied with accuracy? Train a custom model on your specific documents using the visual editor, then reprocess the entire batch.
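As a rough illustration of submitting many jobs at once, here is a minimal Python sketch that fans submissions out over a bounded thread pool. It makes no claim about the real endpoint or payload: `submit_one` is a hypothetical callable you would write around your own HTTP client, and `max_workers` is an arbitrary illustrative limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def submit_all(
    pages: Iterable[str],
    submit_one: Callable[[str], str],
    max_workers: int = 8,
) -> dict[str, str]:
    """Submit every page concurrently and map each page to its job ID.

    `submit_one` is a hypothetical callable that sends one page to the
    API and returns the job ID from the response. It is injected so this
    sketch does not assume any particular endpoint or request format.
    """
    page_list = list(pages)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each page
        # with the job ID returned for that same page.
        job_ids = list(pool.map(submit_one, page_list))
    return dict(zip(page_list, job_ids))
```

Because the server-side queue absorbs the load, the client only needs enough workers to keep requests flowing; the returned mapping is what you would later use to collect results per page.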

Submit

POST images via API — URL, base64, or file upload

Queue

Intelligent job distribution across GPU clusters

Process

Layout analysis + text recognition in parallel

Result

Long polling or async polling — your choice

Export

Plain text, PAGE XML, ALTO, or JSON
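The result step above can be sketched as a simple client-side polling loop. This is an illustration under stated assumptions, not the API's documented contract: `fetch_status` is a hypothetical callable that requests the job's current state, and the `"status"` field with the values `"FINISHED"` and `"FAILED"` is an assumed response shape.

```python
import time
from typing import Callable


def wait_for_result(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state, then return the response.

    Assumed (hypothetical) response shape: {"status": "...", ...}, where
    "FINISHED" and "FAILED" are the terminal states.
    """
    waited = 0.0  # time spent sleeping between polls, not wall-clock time
    while True:
        response = fetch_status()
        status = response.get("status")
        if status == "FINISHED":
            return response
        if status == "FAILED":
            raise RuntimeError(f"job failed: {response}")
        if waited >= timeout:
            raise TimeoutError(f"no result after {timeout} seconds")
        sleep(interval)
        waited += interval
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays. With long polling the server holds the request open instead, so the same loop would simply run with fewer, longer calls.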

Case study

Zeitpunkt.NRW: 15 million newspaper pages in a single project

The state of North Rhine-Westphalia used Transkribus to process 15 million historical newspaper pages — the largest single digitization project on the platform. The collection spans over a century of regional newspapers, now fully searchable and accessible to the public at zeitpunkt.nrw.

15 million pages processed with AI text recognition

Historical Fraktur and blackletter print handled automatically

Publicly accessible and full-text searchable


Structured output, not just flat text

Plain text

Simple UTF-8 text output. Feed into search indexes, databases, or NLP pipelines.

PAGE XML

Full layout coordinates — regions, lines, words, baselines. The standard for HTR workflows.

ALTO XML

Library-standard format for digitized collections. Compatible with Europeana, DFG Viewer, and IIIF.

TEI XML

Text Encoding Initiative format for scholarly editions and digital humanities projects.

Table data

Structured table recognition — rows, columns, and cell content extracted automatically.

Full-text search

Processed documents are instantly searchable within Transkribus — names, dates, places, keywords.
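To show what working with one of the structured formats above might look like, here is a small Python sketch that pulls line-level text out of a PAGE XML export using only the standard library. It assumes the usual PAGE layout, in which each `TextLine` element holds its transcription in a direct `TextEquiv/Unicode` child; the function name `page_xml_lines` is illustrative.

```python
import xml.etree.ElementTree as ET


def page_xml_lines(xml_text: str) -> list[str]:
    """Return the transcribed text of each TextLine, in document order."""
    root = ET.fromstring(xml_text)
    lines = []
    # "{*}" matches any namespace (Python 3.8+), so the sketch does not
    # depend on a specific PAGE schema version.
    for text_line in root.findall(".//{*}TextLine"):
        # Only the line's own TextEquiv is read; Word elements nested in
        # the line carry separate TextEquiv children that are skipped.
        unicode_el = text_line.find("{*}TextEquiv/{*}Unicode")
        if unicode_el is not None and unicode_el.text:
            lines.append(unicode_el.text)
    return lines
```

The same approach extends to coordinates or baselines by reading the other children of each line, which is what makes PAGE XML more useful than plain text for layout-aware workflows.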

Ready to process your collection?

Start with a free account to test on a sample. For large-scale projects, talk to our team about volume pricing and project support.

200M+ pages processed

Volume pricing available

EU-hosted, GDPR-compliant
