Description
This is a first attempt at a model trained on texts in the Romanian Transition Alphabet (1830-1862).
To train an HTR model for these texts, I chose 5 samples dated before and after 1859, when the 2 Romanian provinces became a country with an official language. The samples show the progression from a massive use of Cyrillic letters to a sparser, more eye-friendly one, which makes reading more fluent.
As a general rule, Latin capital letters are preferred for writing titles after 1859.
The Latin letters Z/z, M/m, D/d, S/s, T/t, N/n, A/a, I/i, E/e, O/o, Î/î, U/u, Ŭ/ŭ, Ĭ/ĭ are present from the oldest sampled text (1853), whereas the following Cyrillic letters remain in use at that date: Х/х (ha), Ш/ш (sha), Щ/щ (shcha), Ц/ц (tze), Џ/џ (dzhe), Ч/ч (che), Ъ/ъ (ă), П/п (pe), Р/р (er), Ж/ж (zhe), Ф/ф (ef), К/к (ca), В/в (ve), Л/л (el), Г/г (ghe), Б/б (be).
Among these Cyrillic letters, the first to receive a Latin equivalent are: Ф/ф (ef) → f; Г/г (ghe) → g; Л/л (el) → l; Ж/ж (zhe) → j. At the same time, Р/р (er), П/п (pe), Ъ/ъ (ă), Ч/ч (che), В/в (ve), Ш/ш (sha), Щ/щ (shcha), Ц/ц (tze) tend to be maintained until 1862, when some of them are replaced with glyphs such as “ḑ” (dz), “ş” (sh) and “ț” (tz), which were imported from the Livonian alphabet but entered print only after 1865.
The general guidelines for transcription have been established as follows:
1. Creation of the collection “ALFABET DE TRANZITIE” containing 6 items.
2. Random transcription of initial, middle, and end pages.
3. One-to-one transliteration of all Cyrillic letters, except where K/k stands for the group Ch/ch (e.g. Бukete → Bukete):
Х/х → H/h; Ш/ш → Ș/ș; Щ/щ → Șt/șt; Ц/ц → Ț/ț; Ч/ч → C/c;
Ъ/ъ → Ă/ă; П/п → P/p; С/с → S/s; Р/р → R/r; Ж/ж → J/j; Ф/ф → F/f;
К/к → C/c; В/в → V/v; Л/л → L/l; Г/г → G/g; Б/б → B/b; Џ/џ → G/g.
4. Customization of the following glyphs:
apostrophe, right double quotation mark, double low-9 quotation mark, Ŭ/ ŭ, Ĭ/ ĭ, á.
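
The transliteration rule in step 3 can be illustrated with a short Python sketch. It is not the tooling used to prepare the training data, only an illustration of the mapping listed above. The names `CYRILLIC_TO_LATIN`, `FRONT_VOWELS` and `transliterate` are hypothetical, and the test for the K/k exception (keep K/k when the next letter is e or i, where Romanian spelling writes "che"/"chi") is an assumption about how the exception might be detected automatically.

```python
# Mapping copied from step 3 of the guidelines (Cyrillic source -> Latin target).
CYRILLIC_TO_LATIN = {
    "Х": "H", "х": "h",
    "Ш": "Ș", "ш": "ș",
    "Щ": "Șt", "щ": "șt",
    "Ц": "Ț", "ц": "ț",
    "Ч": "C", "ч": "c",
    "Ъ": "Ă", "ъ": "ă",
    "П": "P", "п": "p",
    "С": "S", "с": "s",
    "Р": "R", "р": "r",
    "Ж": "J", "ж": "j",
    "Ф": "F", "ф": "f",
    "К": "C", "к": "c",
    "В": "V", "в": "v",
    "Л": "L", "л": "l",
    "Г": "G", "г": "g",
    "Б": "B", "б": "b",
    "Џ": "G", "џ": "g",
}

# Assumption (not stated in the guidelines): K/k "stands for Ch/ch"
# whenever the following letter is a Latin e or i.
FRONT_VOWELS = set("eiEI")


def transliterate(text: str) -> str:
    """Replace Cyrillic letters one by one, keeping K/k before e or i."""
    out = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in ("К", "к") and nxt in FRONT_VOWELS:
            # Exception from step 3: write Latin K/k instead of C/c.
            out.append("K" if ch == "К" else "k")
        else:
            # Letters without an entry (Latin letters, punctuation) pass through.
            out.append(CYRILLIC_TO_LATIN.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("Бukete"))  # Bukete
```

Letters that are already Latin are left untouched, so the function can be applied to mixed-script lines as they appear in the samples. Cases where the e/i heuristic does not match the intended reading would still need manual correction.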