Digital Ottoman Corpora Team, Suphan Kirmizialtin · PyLaia · Published May 29, 2023

OttomanTurkish_Print_1

Text Recognition

Description

This print model was trained on a selection of six late 19th- and early 20th-century Ottoman Turkish periodicals and an Ottoman Turkish dictionary. It was created by the Digital Ottoman Corpora team. Transcriptions follow the “half-transcription” latinization scheme recommended by the Turkish Historical Association for late Ottoman-era print material.

Low error rate: 7.2% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 7.2% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
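
To check how the model performs on your own material, you can compute CER yourself from a manually corrected reference transcription and the model's output. The sketch below uses a common definition, the Levenshtein edit distance divided by the length of the reference text. It is an illustration only and is not necessarily the exact procedure Transkribus uses; the function names `levenshtein` and `cer` and the sample strings are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Count the minimum single-character insertions, deletions,
    and substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a fraction of the reference length
    (assumed definition: edit distance / number of reference characters)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example line: one wrong character out of seven.
    reference = "Osmanlı"
    hypothesis = "Osmanli"
    print(f"CER: {cer(reference, hypothesis):.1%}")
```

Under this definition, insertions count as errors too, so CER can exceed 100% when the recognised text is much longer than the reference.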

Words: 180,854
Lines: 23,758
Training Pages: 386
Model ID: 52502
Languages: Turkish Ottoman (1500-1928)