Reduce Archive Backlogs with AI-Powered Text Recognition

Millions of unprocessed pages, too few staff. Transkribus processes entire holdings in batches, making hidden collections searchable and discoverable at institutional scale.

Batch processing · Hidden collections · AI at scale

Trusted by 500,000+ users worldwide, with 200 million+ pages processed

2,000+ archives and libraries
200 million+ pages processed
300+ public AI models
250+ cooperative members

The Problem

The Hidden Collections Crisis: Archive Digitization Backlogs Keep Growing

OCLC estimates that more than 30% of archival collections in the United States alone remain "hidden" — unprocessed, uncatalogued, and effectively invisible to researchers. The situation is comparable across Europe and beyond. These are not marginal materials. They include correspondence, legal records, administrative files, and manuscripts that researchers cannot discover because no finding aid, catalogue entry, or searchable text exists for them. Every year the backlog grows as new acquisitions arrive faster than understaffed teams can process them.
  • Staff shortages are structural, not temporary: archives cannot hire their way out of the backlog
  • Manual transcription of a single archival box can take weeks of skilled labour
  • Unprocessed collections generate no citations, no research, and no public engagement
  • Grant-funded digitisation projects often cover imaging but not text recognition or metadata creation
  • Mixed collections (typescript, handwriting, printed forms) require different approaches that slow manual workflows further

The Solution

Reduce Archival Backlog with AI: From Unprocessed Boxes to Searchable Records

Transkribus enables archives to process collections at a scale that manual workflows cannot achieve. Upload scanned images — entire boxes, series, or fonds — and run AI text recognition across thousands of pages in a single batch. The platform's handwritten text recognition (HTR) handles the scripts and document types most common in archival holdings: administrative handwriting, official correspondence, court records, municipal registers, and mixed-format files. The result is machine-readable, searchable text that can be exported directly into archival information systems.
  • Batch processing: queue thousands of pages and process them unattended, with no page-by-page intervention
  • 300+ public AI models trained on historical scripts from the 15th century onward
  • Export to PAGE XML, ALTO XML, and TEI-XML for ingest into ArchivesSpace, AtoM, and other systems
  • The Metagrapho API enables fully automated pipelines for mass digitisation workflows
  • Publish processed collections directly as searchable digital editions via Transkribus Sites

How to process an archival collection in 4 steps

Upload scanned collections

Upload entire series or fonds as multi-page PDFs, TIFFs, or image batches. Transkribus handles layout detection — columns, tables, marginalia — automatically.

Select an AI model

Choose from 300+ public models filtered by language, century, and script type. For mixed collections, run multiple models on different document groups within the same project.

Run batch recognition

Queue thousands of pages for processing. Transkribus runs text recognition in the background — no manual intervention required. Monitor progress from the dashboard.

Export and integrate

Export results as PAGE XML, ALTO XML, TEI-XML, plain text, or searchable PDF. Ingest directly into ArchivesSpace, AtoM, or publish via Transkribus Sites.
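
As a rough illustration of the export-and-integrate step, the sketch below pulls line-level text out of a PAGE XML export so it can be loaded into a catalogue or search index. It assumes the usual PAGE nesting (TextRegion, TextLine, TextEquiv, Unicode) and matches elements by local name rather than by a fixed schema version. The function name `extract_lines`, the record layout, and the sample document are illustrative assumptions, not part of Transkribus.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_lines(page_xml: str) -> list[dict]:
    """Return one record per TextLine: region id, line id, and transcribed text."""
    root = ET.fromstring(page_xml)
    records = []
    for region in root.iter():
        if local_name(region.tag) != "TextRegion":
            continue
        for line in region:  # direct children only, so nested regions are not counted twice
            if local_name(line.tag) != "TextLine":
                continue
            text = ""
            # The line-level transcription is the TextEquiv directly under TextLine.
            for child in line:
                if local_name(child.tag) == "TextEquiv":
                    for sub in child:
                        if local_name(sub.tag) == "Unicode":
                            text = sub.text or ""
            records.append(
                {"region": region.get("id"), "line": line.get("id"), "text": text}
            )
    return records


# Invented sample page, used only to show the expected input shape.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="box12_0001.tif" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1">
      <TextLine id="r1l1"><TextEquiv><Unicode>Vienna, 3 May 1872</Unicode></TextEquiv></TextLine>
      <TextLine id="r1l2"><TextEquiv><Unicode>Dear Sir,</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

if __name__ == "__main__":
    for record in extract_lines(SAMPLE):
        print(record["region"], record["line"], record["text"])
```

Keeping the region and line identifiers next to the text makes it possible to link a catalogue entry back to the exact place on the scanned page. How the records are then mapped to ArchivesSpace or AtoM fields depends on the local setup.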

At scale

Automated Archival Processing with the Metagrapho API

For institutions running large-scale or recurring digitisation programmes, the Metagrapho REST API enables fully automated processing pipelines. Integrate text recognition directly into your existing imaging and cataloguing workflows — no manual uploads, no browser-based interaction. The API supports model selection, batch job management, and structured output retrieval, making it suitable for production-grade mass digitisation projects.
  • REST API with full documentation for integration into institutional workflows
  • Programmatic model selection: choose different models for different collection types automatically
  • Structured JSON output with text, coordinates, and confidence scores for each text region
  • Batch job management: submit, monitor, and retrieve results for thousands of pages
  • Combine with entity recognition to extract names, dates, and places for catalogue enrichment
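
The capabilities listed above (model selection, batch job management, confidence scores) could be combined along the following lines. This is a minimal sketch under stated assumptions, not the Metagrapho API itself: the `RecognitionClient` interface, the model IDs, the status strings `FINISHED` and `FAILED`, and the JSON shape with a `regions` list are all hypothetical placeholders to be replaced with what the API documentation specifies. An in-memory `FakeClient` stands in for the service so the example runs without network access.

```python
import time
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Iterable, Iterator, Protocol

# Hypothetical mapping from document type to model ID; real IDs come from the model catalogue.
MODEL_BY_TYPE = {"handwriting": 1001, "typescript": 1002, "printed_form": 1003}


@dataclass
class Page:
    path: str
    doc_type: str


class RecognitionClient(Protocol):
    """Hypothetical client interface; adapt it to the actual API documentation."""

    def submit(self, page_path: str, model_id: int) -> str: ...
    def status(self, job_id: str) -> str: ...
    def result(self, job_id: str) -> dict: ...


def chunked(items: Iterable[Page], size: int) -> Iterator[list[Page]]:
    """Yield successive batches of at most `size` pages."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def low_confidence_regions(result: dict, threshold: float = 0.8) -> list[str]:
    """IDs of regions whose confidence is below the threshold (assumed JSON shape)."""
    return [r["id"] for r in result.get("regions", []) if r["confidence"] < threshold]


def process(
    client: RecognitionClient,
    pages: Iterable[Page],
    batch_size: int = 500,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, list[str]]:
    """Submit pages batch by batch, poll until done, and collect pages needing review."""
    review: dict[str, list[str]] = {}
    for batch in chunked(pages, batch_size):
        # Route each page to a model according to its document type.
        jobs = {client.submit(p.path, MODEL_BY_TYPE[p.doc_type]): p for p in batch}
        pending = set(jobs)
        while pending:
            for job_id in list(pending):
                state = client.status(job_id)
                if state == "FINISHED":
                    flagged = low_confidence_regions(client.result(job_id))
                    if flagged:
                        review[jobs[job_id].path] = flagged
                    pending.discard(job_id)
                elif state == "FAILED":
                    review[jobs[job_id].path] = ["<job failed>"]
                    pending.discard(job_id)
            if pending:
                sleep(poll_seconds)
    return review


class FakeClient:
    """In-memory stand-in for the service, used only to make the sketch runnable."""

    def __init__(self) -> None:
        self.jobs: dict[str, dict] = {}

    def submit(self, page_path: str, model_id: int) -> str:
        job_id = f"job-{len(self.jobs) + 1}"
        # Pretend the handwriting model (1001) is less certain than the others.
        confidence = 0.65 if model_id == 1001 else 0.95
        self.jobs[job_id] = {
            "regions": [{"id": "r1", "text": "sample text", "confidence": confidence}]
        }
        return job_id

    def status(self, job_id: str) -> str:
        return "FINISHED"

    def result(self, job_id: str) -> dict:
        return self.jobs[job_id]


if __name__ == "__main__":
    demo = [Page("a.tif", "handwriting"), Page("b.tif", "typescript")]
    print(process(FakeClient(), demo, batch_size=2, poll_seconds=0))
```

Passing the client in as a parameter keeps the routing, batching, and review logic testable on its own; only the thin client class needs to know the real endpoints and authentication.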

Institutional-grade infrastructure for archival collections.

Transkribus is built and hosted in Europe by a cooperative of 250+ archives, libraries, and universities. Your collections stay under your control.

Your data stays yours

Full ownership. Deletable at any time.

Hosted in Austria, EU

Processing on our own servers. GDPR-compliant. No cloud dependencies.

A cooperative, not a startup

More than 250 archives, libraries, and universities as co-owners. Built for decades, not for a VC exit.

Ready to tackle your archive backlog?

Talk to our team about institutional plans for processing large holdings, or create a free account for evaluation.

Used by more than 2,000 archives and libraries worldwide