olof.karsvall · PyLaia · Published November 5, 2022

Jaemtlands domsaga 1.0

Text Recognition

Description

The model “Jaemtlands_domsagasM1+” was trained on ca. 493,455 words from the court books of Jämtland county in Sweden (Jämtlands läns domsaga), covering the years 1647-1688. The books are the originals, written on location by various local writers, not the later copies that were sent to the royal court in Stockholm (“renoverade domböcker”). The texts are written in Swedish. The transcripts used for training are not 100% true to the original spelling: some abbreviations are spelled out (for example, r:dr = riksdaler), and the transcripts contain a few remarks in brackets. The CER is 5.30%.

Low error rate: 5.3% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 5.3% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
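To make the CER figure concrete, here is a minimal sketch of how a character error rate is commonly computed: the Levenshtein edit distance between the reference transcript and the recognised text, divided by the number of characters in the reference. This is the usual textbook definition, not necessarily the exact procedure Transkribus uses (any text normalisation or whitespace handling is not stated on this page), and the function names `levenshtein` and `cer` are illustrative.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn hyp into ref."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate as a fraction of the reference length.
    Assumption: no normalisation is applied to either string."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)


# One wrong character in a nine-character word.
print(f"{cer('riksdaler', 'riksdalar'):.1%}")
```

Under this definition, a CER of 5.3% corresponds to roughly one character error in every 19 characters of reference text.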

Words: 493,455
Lines: 72,578
Training Pages: 1,173
Model ID: 45915
Languages: Swedish