jmgronski · PyLaia · Published July 11, 2025

Civil Records Reader

Text Recognition

Description

Civil Records Reader is a high-fidelity Transkribus model for 19th-century Yiddish handwriting, especially the birth and death records in the civil record books of the Russian Empire. Trained on hundreds of books of civil records from the Pale of Settlement, it delivers reliable, hallucination-free accuracy on the documents genealogists consult most. Researchers can also use its output as ground truth to fine-tune specialist models for particular archives or scribes.

Ideal for: Jewish genealogists (professional and amateur), historians of the Pale of Settlement, and archivists, genealogists and historians processing mass vital-record collections.

Credits

This work was made possible by the L’Dor V’Dor AI Lab Yiddish team with the generous support of the American Jewish Joint Distribution Committee (JDC), LitvakSIG, YIVO, and numerous individual volunteers who contributed documents and transcriptions. The model is built on the Dybbuk for Yiddish Handwriting model, developed by Sinai Rusinek and her team.

For more information, please visit: https://ldvdf.org

Low error rate: 6.72% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 6.72% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material.
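As a rough illustration of how a figure like this can be computed, the sketch below uses the common definition of CER: the Levenshtein edit distance between a reference transcription and the recognised text, divided by the reference length. This definition is an assumption; the page does not state exactly how Transkribus calculates its score. The function names `edit_distance` and `character_error_rate` are hypothetical and written only for this example.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions turning reference into hypothesis."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # reference character missing from hypothesis
                current[j - 1] + 1,      # extra character in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as a percentage of the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One wrong character out of four in the reference.
    print(character_error_rate("abcd", "abxd"))
```

Python strings are sequences of Unicode code points, so a letter written with a separate combining mark counts as two characters in this sketch. Other tools may normalise or segment text differently, which can shift the resulting percentage slightly.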

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.

Words: 39,625
Lines: 21,322
Training Pages: 335
Model ID: 371405
Languages: Yiddish