Gerard Farrell · PyLaia · Published November 4, 2023

Irish, Gaelic and Roman type (Seanchló agus Cló Rómhánach) v.3

Text Recognition

Description

A model for reading Irish (Gaeilge) printed in Gaelic type, or seanchló, which was common before the mid-20th century. It can also read Irish in the standard Roman typeface used today. The model was trained on over 70,000 words of material in various typefaces from the 17th century to the early 20th, weighted towards books published from the mid-19th century in Cló Newman. It can nevertheless handle text printed in earlier fonts, such as Cló Petrie, used in O'Donovan's edition of the Annals of the Four Masters, and the earlier Cló Moxon, used in Bedell's Irish version of the Old Testament (1685). Dotted consonants are transcribed as the consonant followed by 'h', following modern Irish convention, and the Tironian ⁊ is transcribed as 'agus'. Around 30% of the training material consisted of modern printed Irish texts.
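If you want to compare the model's output against an existing diplomatic transcription (one that keeps dotted consonants and the Tironian ⁊), it may help to first convert that transcription to the convention described above. The following is a minimal sketch, not part of the model or of Transkribus: the function name `to_model_convention` is hypothetical, and the set of consonants treated as lenitable (b, c, d, f, g, m, p, s, t) is an assumption based on standard Irish orthography.

```python
import re
import unicodedata

TIRONIAN_ET = "\u204a"          # ⁊
COMBINING_DOT_ABOVE = "\u0307"  # the ponc séimhithe, once decomposed

# Assumption: only these consonants carry the lenition dot in Irish.
_DOTTED = re.compile(f"([bcdfgmpstBCDFGMPST]){COMBINING_DOT_ABOVE}")


def to_model_convention(text: str) -> str:
    """Rewrite dotted consonants as consonant + 'h' and ⁊ as 'agus'."""
    # NFD splits precomposed letters such as ḃ into 'b' + combining dot,
    # so precomposed and decomposed input are handled the same way.
    decomposed = unicodedata.normalize("NFD", text)
    lenited = _DOTTED.sub(r"\1h", decomposed)
    expanded = lenited.replace(TIRONIAN_ET, "agus")
    # NFC recomposes the remaining accents (for example the síneadh fada).
    return unicodedata.normalize("NFC", expanded)


print(to_model_convention("⁊ an ḃean"))  # agus an bhean
```

The sketch leaves capitalisation untouched, so a dotted capital in all-caps text becomes, for example, "Bh" rather than "BH"; adjust this if your ground truth follows a different rule.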

Very low error rate: 1.2% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 1.2% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
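To check how the model performs on your own pages, you can compute a CER yourself from a manually corrected reference and the model's output. The sketch below uses the common definition, Levenshtein edit distance divided by the number of characters in the reference. Whether Transkribus applies exactly this computation (for example, how it treats whitespace or Unicode normalisation) is an assumption, and the function names are illustrative.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction of the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# One wrong character out of five gives a CER of 20%.
print(f"{cer('bhean', 'bhéan'):.1%}")
```

On this definition, a 1.2% CER corresponds to roughly one character error in every 83 characters of reference text.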

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 70,965
Lines: 8,833
Training Pages: 243
Model ID: 56262
Languages: Irish