Transkribus · Super Model · Published January 29, 2024

Dutch Demeter I. (Super Model)

Text Recognition · Scholar+

Description

Dutch Demeter I is a text recognition model for transcribing historical documents in Dutch, covering both manuscripts and printed materials. It is a Super Model, trained on a corpus of over 18 million words that includes significant contributions from various data donors. It handles a diverse range of document types, from notarial handwriting and chronicles to Dutch East India Company records, spanning the 16th to the 20th centuries. The model is designed to interpret the handwriting characteristics of these different historical periods accurately, which makes it a useful resource for digital humanities and archival research.

We extend our gratitude to the data donors for Dutch Demeter, whose contributions (in many cases on behalf of projects that they ran with many other unnamed contributors) are listed below in alphabetical order of their institutions:

Amsterdam City Archives:
- Pauline van den Heuvel
- Jirsi Reinders (also affiliated with Huygens Institute)

Archivo Nacional Aruba:
- Johny van Eerden

Huygens Institute:
- C. Annemieke Romein
- Ineke Huysman and the Johan de Witt-team
- Geertrui Van Synghel

Instituut voor Nederlandse Taal:
- Ruud de Jong
- Nicoline van der Sijs
- Katrien Depuydt
- Jesse de Does

KB Library of the Netherlands:
- Michel de Gruijter
- Sara F. Veldhoen

Leiden University:
- Jesse Dijkshoorn
- Johan Visser
- Bram Caers (NWO-Veni 016.Veni.195.371)

Leuven University:
- Jarrik Van Der Biest

Nationaal Archief:
- Liesbeth Keijser
- Vincent Noppe
- Alan Moss
- Transcribenten

Noord-Hollands Archief:
- Nico Vriend

State Archives of Belgium:
- Gert Gielis (PARDONS project)

Utrechts Archief:
- Heleen Wilbrink
- Rick Companje
- Joyce Pennings
- Floortje Tuinstra
- Kathleen Verdult
- Petra Dreiskämper
- Annelot Vijn and many volunteers

Zeeuws Archief:
- Michiel van Wijngaarden

Very low error rate: 4.9% CER

Character Error Rate (CER) measures the percentage of characters incorrectly recognised. Lower is better. This model scored 4.9% on its validation set. As a rule of thumb, a CER below 10% is considered good for most handwritten material. Super Models are trained on very large and diverse datasets, making them robust across a wide range of handwriting styles and languages. Because of this diversity, a low CER on the validation set is a strong indicator of general-purpose quality.

Measured on the model's own validation data. Results on your documents may differ depending on handwriting style, document condition, language, and how closely your material resembles the training data.
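
To check how the model performs on your own material, the CER described above can be estimated by comparing a model transcription against a manually corrected reference. The sketch below uses the common edit-distance formulation (substitutions, insertions, and deletions divided by the reference length). The function names are hypothetical, and it is an assumption that this matches how the published figure was computed. Details such as whitespace or case normalisation are not stated on this page.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character edits turning reference into hypothesis."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as a percentage of the reference length (assumed formulation)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(reference, hypothesis) / len(reference)
```

Under this formulation, a CER of 4.9% corresponds to roughly 49 character edits per 1,000 characters of reference text.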

Words: 18,000,000
Model ID: 58997
Languages: Dutch