Metagrapho API

Handwriting recognition API for developers

Integrate AI-powered text recognition into your application. REST API with Python, JavaScript, and cURL support. Process handwritten and printed documents at scale.

Used by archives, libraries, and research institutions worldwide

transcribe.py
import requests

TOKEN = "your-bearer-token"
API = "https://transkribus.eu/processing/v2/processes"

# Start a transcription job
resp = requests.post(API,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "config": {"modelId": 38230},
        "image": {
            "imageUrl": "https://your-archive.org/scan.jpg"
        }
    }
)
job = resp.json()
print(f"Job started: {job['processId']}")

Integrate in four steps

From API key to structured text output in minutes.

01

Authenticate

import requests

API_KEY = "your_api_key"
session = requests.Session()
session.headers["Authorization"] = f"Bearer {API_KEY}"

Get your API key from the Transkribus dashboard and initialize the client.
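
A hard-coded key is fine for a first test, but scripts usually keep it out of source control. The sketch below reads the key from an environment variable. The variable name `TRANSKRIBUS_API_KEY` and the helper `auth_headers` are illustrative choices, not part of the API.

```python
import os


def auth_headers(env=os.environ, var="TRANSKRIBUS_API_KEY"):
    """Build the Bearer header from an environment variable.

    The variable name is a hypothetical convention, not defined by the API.
    """
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} before calling the API")
    return {"Authorization": f"Bearer {key}"}
```

With the session from the snippet above, this could be applied as `session.headers.update(auth_headers())`.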

02

Upload

with open("document.pdf", "rb") as f:
    resp = session.post(
        "https://transkribus.eu/api/v2/uploads",
        files={"file": f}
    )
upload_id = resp.json()["uploadId"]

Upload scanned documents as PDF, JPEG, PNG, or TIFF. Batch upload supported.
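
Before a batch upload it can help to set aside files that are not in one of the listed formats. This is a minimal sketch that filters by file extension only. The extension set and the helper name `split_supported` are assumptions, and the service may apply its own validation.

```python
from pathlib import Path

# Extensions assumed to correspond to PDF, JPEG, PNG, and TIFF.
SUPPORTED = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def split_supported(paths):
    """Partition paths into (uploadable, skipped) by file extension."""
    ok, skipped = [], []
    for p in map(Path, paths):
        (ok if p.suffix.lower() in SUPPORTED else skipped).append(p)
    return ok, skipped
```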

03

Transcribe

resp = session.post(
    "https://transkribus.eu/api/v2/jobs",
    json={"docId": upload_id, "modelId": 46003}
)
job_id = resp.json()["jobId"]

Choose a recognition model and start processing. Monitor progress via webhooks or polling.
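
Polling could look roughly like the sketch below. The `status` key and the `FINISHED` / `FAILED` values are assumptions about the job payload, so check them against a real response. The fetch function is passed in so the loop can be tested without network access.

```python
import time


def wait_for_job(fetch_status, interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal state or the timeout expires.

    fetch_status: callable returning the job's JSON as a dict.
    The "status" key and its FINISHED/FAILED values are assumptions.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        status = str(job.get("status", "")).upper()
        if status == "FINISHED":
            return job
        if status == "FAILED":
            raise RuntimeError(f"Job failed: {job}")
        if clock() >= deadline:
            raise TimeoutError("Job did not finish in time")
        sleep(interval)
```

With the session from step 01, `fetch_status` could be `lambda: session.get(f"https://transkribus.eu/api/v2/jobs/{job_id}").json()`.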

04

Export

resp = session.get(
    f"https://transkribus.eu/api/v2/jobs/{job_id}"
)
pages = resp.json()["result"]["pages"]

Download results as PAGE XML, ALTO XML, plain text, PDF, or TEI.
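
Once a PAGE XML export has been downloaded, the line-level text can be read with the standard library. The sketch assumes the export follows the usual PAGE layout (`TextLine` elements with a direct `TextEquiv/Unicode` child). The helper name `page_xml_lines` is illustrative.

```python
import xml.etree.ElementTree as ET


def page_xml_lines(xml_text):
    """Return the line-level transcription from a PAGE XML document.

    Uses the {*} namespace wildcard (Python 3.8+) so the code does not
    depend on a particular PAGE schema version.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.iterfind(".//{*}TextLine"):
        # Direct TextEquiv child only; word-level TextEquiv sits deeper.
        unicode_el = line.find("{*}TextEquiv/{*}Unicode")
        if unicode_el is not None and unicode_el.text:
            lines.append(unicode_el.text)
    return lines
```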

API reference

Full REST API with client libraries for Python, Node.js, and direct HTTP access.

POST /v2/uploads
Upload a document image or PDF for processing. Supports multipart file upload.

Parameters

file (binary, required)
Document file (PDF, JPEG, PNG, TIFF)
collection_id (integer)
Target collection ID
import requests

response = requests.post(
    "https://transkribus.eu/api/v2/uploads",
    headers={"Authorization": "Bearer sk_..."},
    files={"file": open("document.pdf", "rb")}
)
Response
{
  "id": 12345,
  "status": "uploaded",
  "pages": 3,
  "created_at": "2024-01-15T10:30:00Z"
}

What developers build with the API

From batch processing pipelines to intelligent search — see how teams integrate Transkribus.

Batch processing pipelines

Process thousands of document pages automatically. Upload archives, trigger recognition, and collect structured output — all via script.

Python, REST, Webhooks
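API = "https://transkribus.eu/api/v2"  # base URL used in the steps above

for doc in archive:  # each doc is expected to be a file object opened in binary mode
    resp = session.post(API + "/uploads", files={"file": doc})
    uid = resp.json()["uploadId"]
    session.post(API + "/jobs", json={"docId": uid})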
View batch processing guide
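
In a long batch, one unreadable file should not abort the run. The sketch below collects failures instead of stopping at the first one. `upload` and `start_job` are placeholder callables standing in for the session calls shown above, and `run_batch` is an illustrative name.

```python
def run_batch(paths, upload, start_job):
    """Upload each path and start a job, collecting failures instead of aborting.

    upload(path) -> upload id; start_job(upload_id) -> job id.
    Both callables are placeholders for the session calls shown above.
    """
    jobs, failures = {}, {}
    for path in paths:
        try:
            jobs[path] = start_job(upload(path))
        except Exception as exc:  # keep going; report at the end
            failures[path] = exc
    return jobs, failures
```

The returned `failures` mapping can be logged or retried once the run is complete.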

Full-text search indexing

Make handwritten archives searchable. Transcribe documents and feed the output into Elasticsearch, Solr, or your custom search index.

REST, JSON, Elasticsearch
resp = session.get(f"{API}/jobs/{job_id}")
text = resp.json()["result"]["text"]
es.index(index="archives", body={
    "content": text,
    "source": doc_meta
})
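
For larger archives, indexing page by page through the Elasticsearch bulk helper is usually more practical than one call per document. The sketch below builds the actions. The `text` key on each page is an assumption about the job result, and `bulk_actions` is a hypothetical helper.

```python
def bulk_actions(doc_id, pages, index="archives"):
    """Yield one Elasticsearch bulk action per transcribed page.

    pages: list of dicts; the "text" key on each page is an assumption
    about the job result, not a documented field.
    """
    for number, page in enumerate(pages, start=1):
        yield {
            "_index": index,
            "_id": f"{doc_id}-{number}",
            "_source": {
                "doc_id": doc_id,
                "page": number,
                "content": page.get("text", ""),
            },
        }
```

The generator can be passed to `elasticsearch.helpers.bulk(es, bulk_actions(doc_id, pages))`.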

Structured data extraction

Extract tables, fields, and named entities from historical documents. Feed structured data into databases or spreadsheets.

Python, PAGE XML, Field Models
resp = session.post(API + "/jobs",
    json={"docId": uid, "modelId": FIELD_MODEL})
result = session.get(f"{API}/jobs/{resp.json()['jobId']}")
for field in result.json()["result"]["fields"]:
    db.insert(field["name"], field["value"])
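
For the spreadsheet route, the same fields can be written as CSV with the standard library. This sketch assumes each field is a dict with `name` and `value` keys, as in the loop above, and ignores any other keys. `fields_to_csv` is an illustrative name.

```python
import csv
import io


def fields_to_csv(fields):
    """Serialise extracted fields (dicts with "name" and "value") as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "value"],
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(fields)
    return buf.getvalue()
```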

Custom ML pipelines

Train custom recognition models for specialized material. Integrate model training and evaluation into your ML workflow.

Python, PyLaia, Ground Truth
resp = session.post(API + "/models/train",
    json={"name": "Colonial Spanish 1600",
          "gtCollectionId": gt_id,
          "baseModelId": BASE_MODEL_ID})
print(resp.json()["modelId"])
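
For the evaluation side, a common metric for HTR output is the character error rate (CER): the edit distance between the model output and the ground truth, divided by the ground-truth length. The sketch below computes it locally. It is not an API call and may differ from how the platform reports accuracy.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Averaging `cer` over a held-out set of ground-truth lines gives a simple score for comparing a trained model with its base model.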

How we compare

Metagrapho vs. other HTR/OCR APIs

General-purpose OCR APIs are built for printed text. Metagrapho is purpose-built for handwriting recognition, including historical scripts that other services cannot read.

Feature                                   Metagrapho   Google / AWS / Azure
Modern handwriting recognition            Yes          Limited
Historical documents (pre-1900)           Yes          No
Custom model training                     Yes          Limited
300+ specialised HTR models               Yes          No
EU-hosted processing                      Yes          Partial
GDPR-compliant by default                 Yes          Partial
Credit-based pricing (no per-call fees)   Yes          No

Comparison based on publicly available documentation as of 2025. Google Cloud Vision, AWS Textract, and Azure AI Document Intelligence offer general OCR with some handwriting support but no specialised HTR models or historical document capabilities. AWS and Azure offer limited custom training for printed forms. All three offer EU region options with additional configuration.


Enterprise-grade infrastructure. European hosting.

Transkribus is built and operated by READ-COOP SCE, a European cooperative. Your data stays under your control.

EU-hosted processing

All data processed on servers in Austria. No third-party cloud dependencies. Your documents never leave the EU.

GDPR-compliant by design

Full data ownership. Delete documents and results at any time. Data processing agreements available for organisations.

Cooperative ownership

Owned by 250+ archives, libraries, and universities. Built for long-term reliability and the research community, not a VC exit.

Start building with the Metagrapho API

Get your API credentials and start processing documents today. Organisation plans available for production workloads with dedicated throughput and support.

50 free credits per month. No credit card required.

200M+ Pages processed
2,000+ Institutions
300+ AI models