The AI platform built for humanities research.

Transkribus gives you a complete pipeline from document image to structured, searchable text — with model training, layout analysis, entity tagging, and TEI-XML export. No coding required. Born from EU-funded research, governed by a cooperative of 250+ institutions, and used at 40+ universities.

40+ universities using Transkribus
300+ community-trained AI models
TEI-XML export for scholarly editions

The research pipeline

From document image to structured research data — what Transkribus gives you at each stage.

Machine-readable text from any script

Handwritten text recognition (HTR) converts document images into editable, searchable text. 300+ public models cover scripts from medieval Latin to 20th-century Kurrent. If none of them fits your material, train your own model on as few as 50 pages of ground truth.

Structured data, not just raw text

Table recognition, field extraction, and entity tagging turn unstructured documents into structured datasets. Extract names, dates, places, and relationships — ready for databases, spreadsheets, or computational analysis.

Published and citable editions

Export as TEI-XML for scholarly editions, or publish directly as a searchable Transkribus Site. Your transcriptions become a citable, accessible research output — not just a working file on your laptop.

For rigorous research

Reproducible, versioned, documented

Every model in Transkribus is versioned. Your training data is preserved. Accuracy is measured with Character Error Rate (CER) on held-out test sets. This means your transcription workflow is reproducible, auditable, and ready for peer review — the same standards you apply to the rest of your research methodology.
  • Versioned models with documented training data and accuracy metrics
  • Character Error Rate (CER) evaluation on held-out test sets
  • Full export of ground truth, model parameters, and recognition results
  • Cite the exact model version used in your publications
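
For readers who want to see what the CER figure means in practice: the metric is commonly defined as the character-level edit distance (Levenshtein distance) between the recognised text and the ground truth, divided by the number of characters in the ground truth. The Python sketch below illustrates that common definition only. The function names are hypothetical and not part of Transkribus, and the platform's own calculation may normalise whitespace or punctuation differently.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, and substitutions that turn reference into hypothesis."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / length of the ground-truth reference."""
    if not reference:
        raise ValueError("reference (ground truth) must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative line: the recogniser reads "m" as "rn" (2 edits, 11 characters).
ground_truth = "Anno Domini"
recognised = "Anno Dornini"
print(f"CER: {character_error_rate(ground_truth, recognised):.2%}")  # CER: 18.18%
```

To report a figure for a held-out test set, one common convention is to sum the edit distances over all lines and divide by the total number of ground-truth characters, rather than averaging per-line rates.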

No coding required

Train custom AI models in a visual interface

You don't need to write code, manage servers, or understand neural network architectures. Prepare your training data in the built-in editor, click train, and Transkribus builds a model optimised for your specific documents. The same deep learning technology used in computational research — accessible to any humanities scholar.
  • Visual ground truth editor: transcribe and correct in context
  • Start training with as few as 50 transcribed pages
  • Models improve as you add more ground truth
  • Share models with collaborators or the whole community

Resources for DH researchers

Guides, methodology, and tools for integrating Transkribus into your research.

How to Include HTR in Your Grant Proposal

Sample methodology text, CER benchmarks, and data management guidance for DFG, ERC, NEH, AHRC, and other funders.

Methodology

Character Error Rate (CER) Explained

The standard accuracy metric for HTR — how it's calculated, what benchmarks to expect, and how to report it.

Reference

Browse Public Models

300+ community-trained models for scripts from medieval Latin to 20th-century Kurrent. Find a starting point for your documents.

Models

Start your research project with Transkribus

Start for free with 50 credits per month — enough to process hundreds of pages. For larger projects, talk to our team about institutional plans and research partnerships.

300+ public AI models
40+ universities
EU-hosted, GDPR-compliant