Latin Court Hand: KB27/795 (1460)

Description

A specialized Handwritten Text Recognition (HTR) model was developed using PyLaia in Transkribus to improve access to challenging plea rolls (CP40, KB27) from The National Archives. It was trained on AALT website images provided by Robert Palmer, Elspeth Rosbrook, and Susanne Brand. Initially focused on KB27/795, the model tackles dense, heavily abbreviated Court Hand script.

Training followed an iterative strategy: HTR output was refined by an LLM (Anthropic's Claude 3.7 Sonnet) guided by paleographic rules and Vance Mead's index. Lines for which multiple LLM transcriptions yielded a high Character Error Rate (CER) were treated as uncertain and tagged "unclear." These "unclear" lines, often the result of manuscript damage or difficult script, were excluded from the ground truth used to retrain PyLaia. The result was a "clean" training set of high-confidence transcriptions, which improved the model's accuracy on clearer text and achieved ~5% CER on the target roll.

The transcription philosophy emphasizes manuscript fidelity: abbreviations are not expanded, line integrity is strictly preserved, and letterforms and capitalization are reproduced precisely. Having been trained on clean data from KB27/795, the model offers high accuracy on that roll and is expected to perform well on similar rolls, with graceful degradation. It provides visually faithful, non-expanded transcriptions, enhancing access to these vital historical records, especially their clearer sections.
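The "unclear" filtering step can be illustrated with a minimal sketch. It assumes that each line has several candidate transcriptions and that disagreement is scored as the mean pairwise CER between them. The function names, the 0.10 threshold, and the sample strings are all hypothetical; the description does not state how the actual pipeline computed or thresholded CER.

```python
from itertools import combinations
from statistics import mean


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)


def disagreement(candidates: list[str]) -> float:
    """Mean pairwise CER, averaged over both directions of each pair."""
    pairs = list(combinations(candidates, 2))
    if not pairs:
        return 0.0
    return mean((cer(a, b) + cer(b, a)) / 2 for a, b in pairs)


# Hypothetical cut-off: lines whose candidates disagree more than this are "unclear".
UNCLEAR_THRESHOLD = 0.10


def split_lines(lines: dict[str, list[str]]) -> tuple[list[str], list[str]]:
    """Return (clean_ids, unclear_ids); only clean lines would go into ground truth."""
    clean, unclear = [], []
    for line_id, candidates in lines.items():
        if disagreement(candidates) > UNCLEAR_THRESHOLD:
            unclear.append(line_id)
        else:
            clean.append(line_id)
    return clean, unclear


# Made-up placeholder strings, not real transcriptions of the roll.
sample = {
    "line-1": ["abc def ghi", "abc def ghi", "abc def ghi"],
    "line-2": ["abc def ghi", "xyz qrs ghi", "abc dxf g"],
}
clean_ids, unclear_ids = split_lines(sample)
```

Under this sketch, only the lines in `clean_ids` would be kept as training data, which mirrors the idea of retraining on a high-confidence subset.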

Low error rate: 5.25% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 5.25% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
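For reference, CER is commonly defined through character-level edit distance, as written below. This is the standard formulation and is assumed here; the page does not state the exact normalization Transkribus applies.

```latex
\mathrm{CER} = \frac{S + D + I}{N} \times 100\%
```

Here $S$, $D$, and $I$ are the numbers of substituted, deleted, and inserted characters needed to turn the model output into the reference transcription, and $N$ is the number of characters in the reference. Under this definition, a CER of 5.25% corresponds to roughly 5 character errors per 100 reference characters.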

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 181,441

Lines: 11,177

Training Pages: 422

Model ID: 336333

Related models

Medieval Protocolbook 's-Hertogenbosch by Townclerck Petrus de Os sr., 1497-1542

Geerturi van Synghel (Huygens ING) · PyLaia · 3y ago

Latin, Dutch

Transkribus Czech Handwriting M1

Javanese And Latin

UCL–University of Toronto #7