The AI platform built for humanities research.

Transkribus gives you a complete pipeline from document image to structured, searchable text — with model training, layout analysis, entity tagging, and TEI-XML export. No coding required. Born from EU-funded research, governed by a cooperative of 250+ institutions, and used at 40+ universities.


[Image: Transkribus editor, transcription and annotation interface]

40+ universities using Transkribus

300+ community-trained AI models

TEI-XML export for scholarly editions

[Image: AI text recognition on a historical document]

Machine-readable text from any script

Handwritten text recognition (HTR) converts document images into editable, searchable text. 300+ public models cover scripts from medieval Latin to 20th-century Kurrent. If nothing fits, train your own model on as few as 50 pages of ground truth.

[Image: Structured data extraction from historical documents]

Structured data, not just raw text

Table recognition, field extraction, and entity tagging turn unstructured documents into structured datasets. Extract names, dates, places, and relationships — ready for databases, spreadsheets, or computational analysis.

Published and citable editions

Export as TEI-XML for scholarly editions, or publish directly as a searchable Transkribus Site. Your transcriptions become a citable, accessible research output — not just a working file on your laptop.

For rigorous research

Reproducible, versioned, documented

Every model in Transkribus is versioned. Your training data is preserved. Accuracy is measured with Character Error Rate (CER) on held-out test sets. This means your transcription workflow is reproducible, auditable, and ready for peer review — the same standards you apply to the rest of your research methodology.

Versioned models with documented training data and accuracy metrics

Character Error Rate (CER) evaluation on held-out test sets

Full export of ground truth, model parameters, and recognition results

Cite the exact model version used in your publications
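For readers who want to check a reported figure themselves, here is a minimal sketch of how CER is commonly defined: the character-level edit distance between the recognised text and the ground truth, divided by the number of ground-truth characters. The function names and sample strings are illustrative assumptions, not part of Transkribus, and the platform's own calculation may normalise whitespace or Unicode differently.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn `hypothesis` into `reference`."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the hypothesis
                current[j - 1] + 1,      # extra character in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER over a held-out test set:
    total edit distance divided by total ground-truth characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(ref) for ref in references)
    if total_chars == 0:
        raise ValueError("ground truth must contain at least one character")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Illustrative strings only; real input would be held-out ground-truth
    # lines paired with the model's output for the same lines.
    ground_truth = ["kitten"]
    recognised = ["sitting"]
    print(f"CER: {cer(ground_truth, recognised):.2%}")  # prints "CER: 50.00%"
```

Summing edits and characters across the whole test set, rather than averaging per-line rates, keeps very short lines from dominating the result.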


[Image: Model accuracy evaluation with CER metrics]

No coding required

Train custom AI models in a visual interface

You don't need to write code, manage servers, or understand neural network architectures. Prepare your training data in the built-in editor, click train, and Transkribus builds a model optimised for your specific documents. The same deep learning technology used in computational research — accessible to any humanities scholar.

Visual ground truth editor — transcribe and correct in context

Start training with as few as 50 transcribed pages

Models improve as you add more ground truth

Share models with collaborators or the whole community


[Image: Custom model training interface in Transkribus]

How to Include HTR in Your Grant Proposal

Sample methodology text, CER benchmarks, and data management guidance for DFG, ERC, NEH, AHRC, and other funders.

Methodology


Character Error Rate (CER) Explained

The standard accuracy metric for HTR — how it's calculated, what benchmarks to expect, and how to report it.

Reference


Browse Public Models

300+ community-trained models for scripts from medieval Latin to 20th-century Kurrent. Find a starting point for your documents.

Models


Start your research project with Transkribus

Start for free with 50 credits per month — enough to process hundreds of pages. For larger projects, talk to our team about institutional plans and research partnerships.


300+ public AI models

40+ universities

EU-hosted, GDPR-compliant