yngvil.beyer · PyLaia · Published June 23, 2024

SamiskOCR_v1.4

Text Recognition

Description

Model trained on printed text in Northern Sami, Southern Sami, Inari Sami and Lule Sami, plus some Norwegian text. Transkribus Print M1 was used as the base model.

Very low error rate: 0.18% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 0.18% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
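
To make the CER figure concrete, the sketch below computes a character error rate as the Levenshtein (edit) distance between a reference transcription and a recognised line, divided by the reference length. This is the common textbook definition and not necessarily the exact procedure Transkribus uses. The function names `edit_distance` and `cer` are illustrative, and the NFC normalisation step is an assumption added so that accented Sami characters such as "á" count as one character each.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, relative to the reference length."""
    # Assumption: normalise to NFC so composed and decomposed accents compare equal.
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One wrong character in a ten-character word.
    print(f"{cer('Sámegiella', 'Samegiella'):.2f}% CER")
```

By this definition, a CER of 0.18% corresponds to roughly 18 character errors per 10,000 reference characters.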

Words: 47,033
Lines: 8,389
Training Pages: 144
Model ID: 115833
Languages
Norwegian, Southern Sami, Northern Sami, Sami Languages, Lule Sami, Inari Sami
Centuries
1st c., 2nd c., 3rd c., 4th c., 5th c., 6th c., 7th c., 8th c., 9th c., 10th c., 11th c., 12th c., 13th c., 14th c., 15th c., 16th c., 17th c., 18th c., 19th c., 20th c., 21st c.