Project Beta maṣāḥǝft: Manuscripts of Ethiopia and Eritrea · PyLaia · Published November 30, 2022
Manuscripts of Ethiopia and Eritrea
Text Recognition
Description
Model for the transcription of manuscripts of Ethiopia and
Eritrea in Classical Ethiopic (Gǝʿǝz).
It was trained as part of the Beta maṣāḥǝft project
to feed a workflow that imports transcriptions into the project's database.
Transcriptions for the training were kindly provided by:
- Alessandro Bausi for ESum039, ff. 16vb-29va;
- Antonella Brita for DAS002, ff. 101va-110ra;
- Dorothea Reule for ESqdq004, ff. 97ra-101vb, 104ra-109rb;
- Nafisa Valieva for BLorient718, ff. 1ra-7vb (images: British Library);
- Jeremy Brown for several parts of manuscripts pertaining to the Miracle of the Cannibal of Qemer.
Images and transcriptions were imported into Transkribus by Pietro Liuzzo.
The project Beta maṣāḥǝft: Manuscripts of Ethiopia
and Eritrea (Schriftkultur des christlichen Äthiopiens
und Eritreas: eine multimediale Forschungsumgebung)
is a long-term project funded within the framework of
the Academies' Programme (coordinated by the Union
of the German Academies of Sciences and Humanities)
under the supervision of the Akademie der Wissenschaften in
Hamburg. Funding is provided for 25 years,
from 2016 to 2040. The project is hosted by the Hiob
Ludolf Centre for Ethiopian Studies at the Universität
Hamburg. It aims to create a virtual research
environment for managing complex data related
to the predominantly Christian manuscript tradition
of the Ethiopian and Eritrean Highlands.
Very low error rate: 3.8% CER
Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 3.8% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
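To make the CER figure concrete, here is a minimal sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between a reference transcription and the recognised text, divided by the length of the reference. This is an illustration under that common definition, not necessarily the exact procedure Transkribus uses; the function names `levenshtein` and `cer` are hypothetical, and characters are counted as Unicode code points.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn `reference` into `hypothesis`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction of the reference length.

    Assumption: one character equals one Unicode code point.
    """
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Under this definition, a CER of 3.8% corresponds to roughly 38 character edits per 1,000 characters of reference text. Multiply the return value of `cer` by 100 to obtain a percentage.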
Words: 53,830
Lines: 21,173
Training Pages: 282
Model ID: 48371