
What Is Handwritten Text Recognition (HTR)?

Handwritten Text Recognition uses deep learning to convert handwritten documents into machine-readable text. Unlike standard OCR, HTR is trained on real handwriting samples and can decipher historical scripts, cursive, and connected letterforms.


Try it out: upload a handwritten document and see HTR in action. Used at 500+ universities worldwide.

500,000+ users worldwide
200 million+ pages processed
300+ public HTR models
100+ languages and scripts

The Problem

Why Standard OCR Fails on Handwriting

Optical Character Recognition (OCR) was designed for printed text — uniform typefaces with consistent letter spacing and predictable layouts. When applied to handwritten documents, standard OCR produces unusable results. Handwriting is inherently variable: letterforms differ between writers, characters connect in unpredictable ways, and historical scripts like Kurrent, Sütterlin, or Secretary Hand bear little resemblance to modern print. This is the core problem that Handwritten Text Recognition technology was developed to solve.
  • Standard OCR engines expect uniform character shapes, but handwriting varies between writers and even within a single page
  • Connected and cursive scripts cannot be segmented into individual characters the way printed text can
  • Historical scripts (Kurrent, Secretary Hand, Copperplate) use letterforms absent from modern OCR training sets
  • Abbreviations, ligatures, and superscript conventions in historical manuscripts have no equivalent in print
  • Document degradation (faded ink, bleed-through, foxing) compounds the challenge beyond what rule-based systems handle
Comparison: OCR output versus HTR output on a handwritten document

The Solution

How Does HTR Work? AI Handwriting Recognition Explained

Handwritten Text Recognition uses deep neural networks — typically a combination of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) — to learn the visual patterns of handwriting directly from labelled examples. Rather than relying on predefined rules about what letters look like, an HTR model is trained on thousands of images paired with their correct transcriptions (called "ground truth"). Through this training, the model learns to recognise not just individual characters, but sequences of connected strokes, contextual letter shapes, and the spatial relationships between text elements on a page.
  • Layout analysis detects text regions, lines, and structural elements (columns, tables, marginalia) on the page
  • Line segmentation isolates individual text lines from the detected layout
  • The neural network processes each line image and predicts a sequence of characters, considering context from surrounding strokes
  • Language modelling and post-processing refine the output, resolving ambiguous characters using statistical patterns
  • Confidence scores are assigned to each predicted character and line, enabling targeted quality review
Diagram: how HTR processes a handwritten document from image to text
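
The sequence-prediction step can be illustrated with a small sketch. It assumes the network is trained with Connectionist Temporal Classification (CTC), a common choice for CNN+RNN line recognisers; this page does not say which decoding scheme any particular HTR engine uses. The `BLANK` symbol, the sample `frames`, and the `greedy_ctc_decode` helper are hypothetical names chosen for illustration. The point is that the model emits one prediction per narrow slice of the line image, and the text is obtained by collapsing repeats and dropping blanks, so no character segmentation is needed.

```python
from typing import Dict, List, Tuple

BLANK = "-"  # hypothetical "no character here" symbol used by CTC


def greedy_ctc_decode(frames: List[Dict[str, float]]) -> Tuple[str, List[float]]:
    """Collapse per-frame predictions into text plus per-character confidences.

    Each frame maps candidate symbols to probabilities for one horizontal
    slice of the line image.
    """
    chars: List[str] = []
    confs: List[float] = []
    prev = BLANK
    for probs in frames:
        symbol = max(probs, key=probs.get)  # most likely symbol in this slice
        p = probs[symbol]
        if symbol == BLANK:
            prev = BLANK  # a blank separates genuinely repeated letters
            continue
        if symbol == prev:
            # The same character spans several slices: keep its best score.
            confs[-1] = max(confs[-1], p)
        else:
            chars.append(symbol)
            confs.append(p)
        prev = symbol
    return "".join(chars), confs


# Hypothetical network output for a short word.
frames = [
    {"h": 0.9, "-": 0.1},
    {"h": 0.8, "-": 0.2},
    {"-": 0.7, "a": 0.3},
    {"a": 0.6, "o": 0.4},
    {"l": 0.95, "-": 0.05},
    {"-": 0.9, "l": 0.1},
    {"l": 0.85, "-": 0.15},
]

text, confs = greedy_ctc_decode(frames)
print(text, confs)
```

In this sample the second character is only weakly preferred over its alternative, which is the kind of position a per-character confidence score would flag for review. Production systems often replace the greedy choice with beam search combined with a language model.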

HTR vs OCR

Handwritten Text Recognition vs. Optical Character Recognition

HTR and OCR are related technologies but address fundamentally different challenges. Understanding the distinction is important when evaluating tools for historical document processing.

| Feature | HTR (Handwritten Text Recognition) | Standard OCR |
| --- | --- | --- |
| Designed for | Handwritten and cursive text | Printed and typewritten text |
| Character segmentation | Not required: connected strokes are processed as sequences | Requires isolating individual characters |
| Historical scripts | Kurrent, Secretary Hand, Copperplate, and 100+ more | Limited or no support |
| Training approach | Deep learning on labelled handwriting samples (ground truth) | Rule-based pattern matching or print-trained models |
| Adaptability | Custom models can be trained for specific hands or scripts | Generally fixed; cannot adapt to new handwriting styles |
| Accuracy on handwriting | Typically 90–98% character accuracy with trained models | Often below 50% on cursive or historical handwriting |
| Layout analysis | Handles complex layouts: columns, tables, marginalia | Basic; assumes simple left-to-right text flow |
| Connected scripts | Yes: Arabic, Hebrew, cursive Latin scripts | Limited or unsupported |
| Degraded documents | Robust: trained on real historical documents with damage | Performance degrades significantly |
| Confidence scoring | Per-character and per-line confidence scores | Varies; often absent or unreliable |

Comparison reflects general capabilities of HTR systems (including Transkribus) versus standard OCR engines. Specific results depend on document type, model selection, and document condition.
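
Character accuracy figures like those in the comparison are commonly derived from the character error rate (CER): the edit distance between the recognised text and the ground truth, divided by the length of the ground truth. The sketch below assumes accuracy is reported as 1 − CER; this page does not state how its figures were measured, and the function names are illustrative only.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Return 1 - CER; can be negative if hyp is much longer than ref."""
    if not ref:
        raise ValueError("ground truth must not be empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


ground_truth = "Handschrift"
recognised = "Handsehrift"  # one wrong character out of eleven
accuracy = character_accuracy(ground_truth, recognised)
print(f"{accuracy:.1%}")
```

One wrong character in an eleven-character word already puts accuracy at the lower end of the 90–98% range, which is why the measure is normally computed over whole pages rather than single words.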

Coverage

What Scripts, Languages, and Centuries Does HTR Support?

Modern HTR platforms — Transkribus in particular — support a remarkably broad range of scripts, languages, and historical periods. The key is the availability of trained models. Because HTR models learn from examples rather than rules, any script for which sufficient training data exists can be supported. Transkribus offers over 300 public models contributed by researchers and institutions worldwide, covering documents from the 9th century to the present day.
  • Latin scripts: modern and historical variants including Kurrent, Sütterlin, Secretary Hand, Copperplate, Humanistic, and Gothic cursive
  • Non-Latin scripts: Arabic, Hebrew, Greek, Cyrillic, Devanagari, Chinese, Japanese, and more, with models available or trainable
  • 100+ languages represented in the public model catalogue, from German and English to Finnish, Hungarian, and Ottoman Turkish
  • Time span from medieval manuscripts (9th century) through early modern administrative records to 20th-century correspondence
  • Mixed-script documents: models can handle pages containing multiple scripts (e.g., Latin headings with Kurrent body text)
Examples of handwriting scripts supported by HTR: Kurrent, Arabic, Secretary Hand, and more

Used at leading research institutions worldwide

Who uses HTR

Handwritten Text Recognition Technology in Practice

HTR has moved beyond the experimental stage. It is now a production tool used across the humanities, cultural heritage, and information science. Researchers use it to build searchable corpora from manuscript collections. Archives use it to process backlogs of undigitised holdings. Libraries use it to make special collections discoverable. The technology is particularly transformative in contexts where the volume of handwritten material makes manual transcription economically impossible.
  • Digital humanities researchers transcribing correspondence, diaries, and literary manuscripts for scholarly editions
  • National and municipal archives processing administrative records, court documents, and civic registers at scale
  • Libraries and special collections making finding aids and catalogue records searchable and discoverable
  • Genealogists reading parish registers, census returns, and civil records in historical scripts
  • Cultural heritage projects digitising endangered manuscript collections before physical deterioration

Beyond recognition

The Full Pipeline: From Handwritten Document to Structured Data

Handwritten text recognition is one step in a larger document processing pipeline. A complete workflow begins with digitisation (scanning or photography), proceeds through layout analysis and text recognition, and continues into post-processing: entity recognition, metadata extraction, structured export, and publication. Transkribus integrates all of these stages into a single platform, so researchers do not need to stitch together separate tools for each step.
  • Layout analysis: automatic detection of text regions, columns, tables, headings, and marginalia
  • Text recognition: HTR converts detected text lines into machine-readable characters
  • Custom model training: fine-tune models on your specific manuscript type for higher accuracy
  • Entity recognition and tagging: identify persons, places, dates, and other named entities in the transcribed text
  • Export as TEI-XML, PAGE XML, ALTO XML, searchable PDF, or plain text, ready for analysis, publication, or archival ingest
The complete document processing pipeline: from scan to structured data
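
As an illustration of what can be done with an export, the sketch below reads text lines from a PAGE XML file with Python's standard library and lists the lines whose confidence falls below a threshold. The embedded sample is a hypothetical, heavily abridged document (a real export also carries coordinates and metadata), and whether a given export populates the `conf` attribute on each line is an assumption. The `read_lines` helper and the 0.8 threshold are illustrative choices only.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Namespace of the 2013-07-15 PAGE content schema.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Hypothetical, abridged sample; not taken from any real export.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page_001.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1">
      <TextLine id="r1l1">
        <TextEquiv conf="0.97"><Unicode>Anno 1784 den 3ten Martii</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l2">
        <TextEquiv conf="0.61"><Unicode>ist getauft worden</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
""".strip()

Line = Tuple[Optional[str], str, Optional[float]]


def read_lines(xml_text: str) -> List[Line]:
    """Return (line id, text, confidence) for every TextLine in the page."""
    root = ET.fromstring(xml_text)
    lines: List[Line] = []
    for line in root.iterfind(".//pc:TextLine", NS):
        equiv = line.find("pc:TextEquiv", NS)
        if equiv is None:
            continue  # line was detected but not transcribed
        text = equiv.findtext("pc:Unicode", default="", namespaces=NS)
        conf_attr = equiv.get("conf")
        conf = float(conf_attr) if conf_attr is not None else None
        lines.append((line.get("id"), text, conf))
    return lines


lines = read_lines(SAMPLE)
# Lines below the (arbitrary) threshold are candidates for manual review.
needs_review = [lid for lid, _, conf in lines if conf is not None and conf < 0.8]
print(needs_review)
```

Filtering on confidence in this way is one route to the targeted quality review that per-line scores make possible: a reviewer checks only the flagged lines instead of rereading every page.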



Built for research. Hosted in Europe. Open and cooperative.

Transkribus is developed and hosted by READ-COOP SCE, a European cooperative of 250+ institutions. Your data stays yours.

Your data stays with you

Full ownership. Deletable at any time.

Hosted in Austria, EU

Processing on our own servers. GDPR-compliant. No cloud dependencies.

A cooperative, not a startup

Thousands of archives, libraries, and universities as co-owners. Built for decades, not for a VC exit.

Ready to Try Handwritten Text Recognition?

Create a free account and process your first documents with Transkribus. 50 free credits per month, no credit card required.

Used at 500+ universities and research institutions
