SOLR Keyword Spotting | API
This search is only possible if the HTR output has been post-processed (typically by UPVLC; contact info@readcoop.eu with questions).
Keywords can be searched in the SOLR index with a GET request to
https://transkribus.eu/TrpServer/rest/keyword
with the following parameters:
query (string) – the keyword to be searched
start (int, default: 0) – first result
rows (int, default: 10) – number of successive results to fetch
- To process large numbers of hits, SOLR allows a search to start at a specific hit and return only the next N hits from there onward. This can be used to browse results page by page (e.g. the first page starts at 0 and shows 10 results, the next page starts at 10 and shows the next 10, etc.)
probL (float) – lower limit for keyword probability (usually between 0.0 and 1.0)
probH (float) – upper limit for keyword probability (usually 1.0)
- Each keyword is stored with a probability value. Searches can be limited to results above or below a certain probability. (Note: Currently, the keyword probabilities are stored directly as provided. To transform these probabilities into true relevance probabilities, a calibration function is required in the user interface.)
filter (string) – restricts search results to certain fields and values (can be given multiple times, as in …&filter=cId:1895&filter=id:4243_221_*…)
- The fields to filter by are:
  id (string) – index element id, consisting of document id, page number and a running number for the word on the page, separated by underscores, e.g. 4432_15_10 is word 10 on page 15 of document 4432. Setting the filter string to 4432_15_* limits searches to this document and page; *_20_* limits searches to page 20 of any document.
  title (string) – title of the document
  cId (int) – collection id
  auth (string) – name of the author
fuzzy (int) – accepts any integer value, but SOLR currently only supports values between 0 and 2
- SOLR can include results that differ from the keyword by a certain number of characters.
sorting (string) – sorts results by certain fields (usually “rp desc” to show results with descending probability)
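The paging and id filter conventions described above can be sketched in a few lines of Python. The helper names page_start and id_filter are hypothetical and not part of the API; the sketch also assumes that start is a zero-based offset, as in standard SOLR paging, and that a wildcard may be used for any part of the id.

```python
def page_start(page, rows=10):
    """Return the `start` offset for a zero-based page number (assumed zero-based offset)."""
    return page * rows


def id_filter(doc_id=None, page_nr=None, word_nr=None):
    """Build an id filter value; any part left as None becomes the wildcard '*'."""
    parts = [doc_id, page_nr, word_nr]
    return "id:" + "_".join("*" if p is None else str(p) for p in parts)


print([page_start(p) for p in range(3)])  # [0, 10, 20]
print(id_filter(4432, 15))                # id:4432_15_*
print(id_filter(page_nr=20))              # id:*_20_*
```

The resulting strings would be passed as the values of the start and filter parameters respectively.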
Example:
Searching for the keyword “london” in collection 1234 with any probability, displaying the first 100 results sorted by descending probability.
https://transkribus.eu/TrpServerTesting/rest/search/keyword?query=london&start=0&rows=100&probL=0.0&probH=1.0&filter=cId:1234&fuzzy=0&sorting=rp+desc
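As a rough illustration, the example URL can be assembled from its parameters with Python's standard library. The function name build_keyword_url and its argument names are hypothetical; the endpoint is copied from the example above, and sending the request (including any authentication the server may require) is deliberately left out of this sketch.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above.
BASE_URL = "https://transkribus.eu/TrpServerTesting/rest/search/keyword"


def build_keyword_url(query, start=0, rows=10, prob_l=0.0, prob_h=1.0,
                      filters=(), fuzzy=0, sorting="rp desc"):
    """Assemble the GET URL for a keyword search (hypothetical helper)."""
    params = [
        ("query", query),
        ("start", start),
        ("rows", rows),
        ("probL", prob_l),
        ("probH", prob_h),
    ]
    # The filter parameter may be repeated, once per filter value.
    params += [("filter", f) for f in filters]
    params += [("fuzzy", fuzzy), ("sorting", sorting)]
    # safe=":*" leaves filter values such as cId:1234 or 4432_15_* unescaped;
    # spaces are encoded as '+', so "rp desc" becomes "rp+desc".
    return BASE_URL + "?" + urlencode(params, safe=":*")


url = build_keyword_url("london", rows=100, filters=["cId:1234"])
print(url)
```

Called as shown, the helper reproduces the example query string, with the parameters in the same order.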