Silvia Carboni for Mediobanca SpA · PyLaia · Published September 20, 2024

Italian - 20th Century Minutes of Mediobanca's Board of Directors & Exec. Committee

Text Recognition

Description

The “Italian - 20th Century Minutes of Mediobanca's Board of Directors & Exec. Committee” model was created to transcribe the handwritten Minutes of the Board of Directors and Executive Committee of Mediobanca, an Italian investment bank. Italian Minutes from that period were written by people trained in calligraphy, so the hands are similar in letter form even though they differ in ductus. The model could therefore be useful for transcribing other 20th-century Minutes. The Training Set covers 13 different hands and 90,729 words (about 50 pages per hand). A previously created, non-public model served as the base model; it was trained on a ground truth of 25 pages for each of the 6 most frequent hands in the Minutes. The documents contain some abbreviations: those specific to banking activities and people’s titles have been expanded. The model achieved a Character Error Rate of 2.38%. It was trained by Silvia Carboni for Mediobanca’s Historical Archive.

Very low error rate: 2.38% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 2.38% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
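
To make the Character Error Rate described above concrete, here is a minimal sketch that computes it as the character-level edit distance between a reference transcription and a recognised text, divided by the length of the reference. This is the common textbook definition and is an assumption here: the page does not say how Transkribus normalises text (whitespace, case, punctuation) before scoring. The function names and sample strings are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of character insertions,
    deletions and substitutions needed to turn hypothesis into reference."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # character missing from hypothesis
                current[j - 1] + 1,                   # extra character in hypothesis
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as a percentage of the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one character dropped from a short line.
ground_truth = "Consiglio di Amministrazione"
recognised = "Consiglio di Amministrazone"
print(f"CER: {character_error_rate(ground_truth, recognised):.2f}%")
```

Running the same calculation on a few pages of your own material that you have transcribed by hand gives a more realistic estimate than the validation figure, for the reasons given above.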

Words: 90,729
Lines: 14,728
Training Pages: 498
Model ID: 178985
Languages: Italian
Centuries: 20th c.