j.nockels · PyLaia · Published October 24, 2025

Ghostwriter

Text Recognition

Description

Trained on 'The Spiritualist Newspaper' (1869), using images from the National Library of Scotland (NLS) Data Foundry (https://data.nls.uk/data/digitised-collections/spiritualist-newspapers/) that were binarised in R. This ground truth serves as a main benchmark for assessing character and word error rates (CER/WER) across Automatic Text Recognition models, from open-source to commercial, as part of the NLS-funded project 'Recognising Text, Recognising Processes - eXplainable Automatic Text Recognition for Scottish Spiritualistic Newspapers'. Image credit: 'Seeing is Believing: Spiritualism in the Victorian Era', the Old Operating Theatre, https://oldoperatingtheatre.com/seeing-is-believing-spiritualism-in-the-victorian-era-part-2/

Very low error rate: 0.89% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 0.89% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
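For readers who want to check CER or WER on their own transcriptions, the following is a minimal sketch using one common definition: the Levenshtein edit distance between the reference and the recognised text, divided by the reference length. How this platform normalises text before scoring (whitespace, punctuation, case) is not stated here, so the function names and the absence of any normalisation are assumptions, and figures computed this way may not match the reported 0.89% exactly.

```python
from typing import Sequence


def levenshtein(ref: Sequence, hyp: Sequence) -> int:
    """Minimum number of substitutions, insertions and deletions
    needed to turn `hyp` into `ref` (works on strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: same idea, on whitespace-separated tokens."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Illustrative strings only, not taken from the training data.
    ground_truth = "the spirit spoke"
    recognised = "the spirit spake"
    print(f"CER: {cer(ground_truth, recognised):.2%}")
    print(f"WER: {wer(ground_truth, recognised):.2%}")
```

Because insertions count as errors, a rate computed this way can exceed 100% when the recognised text is much longer than the reference.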

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 158,825
Lines: 17,787
Training Pages: 45
Model ID: 421381
Languages: English
Centuries: 19th c.