Silvia Carboni for Mediobanca SpA · PyLaia · Published October 10, 2025
20th Century Typewritten Italian
Text Recognition
Description
This model was created by the Historical Archive of Mediobanca, an Italian investment bank, to transcribe a variety of typewritten documents in Italian and English, such as correspondence, articles, meeting minutes, and memos. The training set consists of 28,137 words. The model is trained to expand abbreviations specific to banking activities and people’s honorifics.
Mediobanca's internationalization has left many documents in English, so "Transkribus Print M1" was used as the base model. Although "20th Century Typewritten Italian" is specifically geared towards the Italian language, it was also tested on documents in English, where it recognized words and short sentences with satisfactory results.
This model achieved a Character Error Rate of 0.82%. This model was created alongside the "20th Century Typewritten Letters - Diplomatics' Elements" Field Model, as part of a larger project. It was trained by Silvia Carboni for Mediobanca's Historical Archive.
Very low error rate: 0.82% CER
Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 0.82% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
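To make the CER definition above concrete, here is a minimal sketch in Python. It assumes the common formulation, in which CER is the character-level edit distance (substitutions, deletions, and insertions) between the model output and the reference transcription, divided by the length of the reference. The page does not state how Transkribus normalizes text before scoring, so the function names `levenshtein` and `cer` and the absence of any normalization are illustrative assumptions, not the platform's implementation.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character edits turning reference into hypothesis."""
    # prev[j] holds the distance between the processed prefix of the
    # reference and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a fraction of the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example strings, not taken from the model's validation set.
    ground_truth = "Mediobanca"
    recognised = "Mediobanco"
    print(f"CER: {cer(ground_truth, recognised):.2%}")
```

Under this definition, a CER of 0.82% corresponds to roughly one character edit for every 120 reference characters. To check a model against your own material, compare its output with a manually corrected transcription of the same lines.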
Words: 28,137
Lines: 4,692
Training Pages: 187
Model ID: 413877