a.romein · PyLaia · Published May 5, 2026

Latvian model 19th century

Text Recognition

Description

This text recognition model was created with late 19th-century manuscripts from the collection of the Scientific Commission of the Riga Latvian Society, preserved at the Archives of Latvian Folklore (ALF). The materials of the Scientific Commission form the archive's oldest collection and one of its most extensive. The collection contains unique handwritten sources representing a wide range of folklore genres, together with valuable ethnographic and linguistic notes that support the interpretation, study, and publication of Latvian folk traditions.

The manuscripts were collected during a period of growing scholarly and public interest in Latvian oral tradition, language, and cultural heritage. Many of the materials include contextual information about the time and place of performance, explanations of dialectal or unusual words, references to place names, vocabulary lists, botanical terminology, riddles, songs, and other forms of traditional knowledge. As the largest folklore collecting centre in Latvia during the nineteenth century, the Scientific Commission of the Riga Latvian Society played an important role in documenting Latvian oral culture and strengthening the foundations of Latvian folklore studies.

The model was trained on existing manuscript transcriptions produced by the volunteer contributor community of the ALF. It is intended to support the transcription and study of nineteenth-century Latvian handwritten materials, especially folklore manuscripts, and to assist researchers, archivists, students, and other users working with the manuscript heritage of the Scientific Commission of the Riga Latvian Society and the broader documentary legacy in Latvian.

This model was developed as part of the project ȬPEN: Open Knowledge Ecosystems for the Advancement of Citizen Science, funded by the University of Latvia (ZDA-LIP 2025/2). Work on the model began during the Baltic Summer School of Digital Humanities 2025 (BSSDH 2025) and continued as a collaboration between Transkribus, the University of Latvia Digital Humanities Center (UL DHC), the University of Latvia Library (ULL), and the UL ILFA Archives of Latvian Folklore (ALF).

Contributors: Sanita Reinsone (UL DHC), Annija Grīsle (UL), Zintis Gūts (LU), Kristīne Plostniece (UL), Sandis Laime (ALF), Uldis Ķirsis (ALF), Annemieke Romein (UTwente/READ COOP SCE), Bettina Anzinger (READ COOP SCE).

Participants of the BSSDH 2025: Sylwia Lech, Olesja Beketova, Vladislavs Babaņins.

Low error rate: 5.46% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 5.46% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. This is a larger model trained on diverse material, which generally makes it more robust across different handwriting styles. That said, larger training sets also make it harder to push the CER down further.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
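To see how the model performs on your own material, you can compute a CER yourself by comparing a few hand-checked lines against the model's output. The sketch below uses the common definition of CER, the character-level edit distance divided by the length of the reference text. This page does not say exactly how the 5.46% figure was calculated, so that definition is an assumption, as is the choice to normalise both strings to Unicode NFC first (so that composed and decomposed forms of letters such as "ā" count as the same character). The function names `levenshtein` and `cer` are illustrative and not part of any Transkribus API.

```python
import unicodedata


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hyp` into `ref`."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate in percent, relative to the reference length.

    Assumption: both strings are normalised to NFC before comparison.
    """
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = "tautas dziesma"   # hand-checked transcription (made-up example)
    hypothesis = "tautas dzesma"   # recognised text with one missing character
    print(f"CER: {cer(reference, hypothesis):.2f}%")
```

Because the result is relative to the reference length, a single wrong character weighs heavily on a short line; averaging over several pages gives a steadier estimate than scoring one line.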

Words: 453,051
Lines: 164,080
Training Pages: 3,439
Model ID: 562817
Languages: Latvian
Centuries: 19th c.