Burgenland Croatian Typewritten 2010-2019

Description

Burgenland Croatian Typewritten 2010–2019 is a Text Recognition model, curated and trained by Marija Đokić Petrović (School of Computing, Union University Belgrade, Serbia) and Marko Simonović (Institute of Slavic Studies, University of Graz, Austria). The model is trained on three issues of the weekly newspaper "Hrvatske novine" (18 June 2010; 17 October 2014; 11 January 2019), obtained from ANNO/Austrian National Library. It is the first dedicated model for Burgenland Croatian (Glottolog: burg1244; IETF: ckm-AT). It focuses on post-2000 printed material, a period marked by intensified standardisation, and is optimised for texts published from 2010 onwards. The model will be updated regularly to improve its accuracy.

Very low error rate: 2.48% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 2.48% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
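To make the metric concrete, here is a minimal sketch of how a CER is commonly computed: the edit distance (insertions, deletions and substitutions) between the recognised text and the reference transcription, divided by the length of the reference. This is an assumption about the usual definition, not a description of how Transkribus calculates its score; details such as whitespace handling or text normalisation may differ. The function names `levenshtein` and `cer` are illustrative only.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn hypothesis into reference."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # reference character missing from hypothesis
                curr[j - 1] + 1,     # extra character in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a percentage of the reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(reference, hypothesis) / len(reference)


# One wrong character in a 15-character line (hypothetical recognition output).
print(round(cer("Hrvatske novine", "Hrvatske novlne"), 2))
```

Under this definition, a CER of 2.48% corresponds to roughly one character error in every 40 characters of reference text.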

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 58,405

Lines: 12,298

Training pages: 80

Model ID: 442685

Related models

Glagolitic printings PyLaia

Transkribus Print M1

Ruthenian Chuhaister

DiJeSt 3.0