luciafwx · PyLaia · Published September 27, 2022

General Portuguese M1

Text Recognition

Description

First attempt at a general Portuguese model. The ground truth consists of handwritten and printed documents, some of which are damaged. The model is the result of a collaboration between two different projects.

Very low error rate: 3.8% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 3.8% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
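To make the CER figure concrete, here is a minimal Python sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between the recognised text and the ground truth, divided by the length of the ground truth. This is the usual textbook definition, not necessarily the exact procedure Transkribus or PyLaia applies (any whitespace or punctuation normalisation is unknown here). The function names `levenshtein` and `cer` and the sample strings are illustrative assumptions.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction of the reference length.

    Assumed definition: edit distance / number of reference characters.
    The value can exceed 1.0 when the hypothesis is much longer.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Illustrative strings, not taken from the model's validation set.
ground_truth = "senhor"
recognised = "senhar"
print(f"CER: {cer(ground_truth, recognised):.1%}")  # one wrong character out of six
```

Under this definition, a CER of 3.8% corresponds to roughly 38 character edits per 1,000 characters of ground truth.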

Words: 64,842
Lines: 10,097
Training Pages: 241
Model ID: 44949
Languages: Portuguese
Centuries: 1st c. to 21st c.