Tibetan Modern U-chen Print 0.1

Description

Tibetan Modern U-chen Print 0.1 (TMUP 0.1) is the first Transkribus HTR model for printed Tibetan language publications in Uchen (དབུ་ཅན་ dbu can) script. It has been trained on texts that were published in the PRC between the 1950s and 1980s. The model was trained on 522 pages in 20 documents. The training set consists of 470 pages; the validation set consists of 52 (10%) automatically selected pages. No basemodel was used. The model was developed by Franz Xaver Erhard (Leipzig University) and Xiaoying 笑影 (Leipzig University) for the Divergent Discourses project (DFG/AHRC). https://research.uni-leipzig.de/diverge/

Try this model

Drag an image here

Select a file...

PNG or JPG up to 10 Mb

Wolpi

AI Assistant

By uploading an image, you accept our terms and privacy policy.

Use this model Open in Transkribus

Very low error rate1.8% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 1.8% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words9,822

Lines7,989

Training Pages432

Model ID60669

Related models

Description

Try this model

Related models

Tibetan Generic 0.1

Tibetan cursive (Betsug)

Tibetan cursive (Drutsa)

PaganTibet Ume 5