Layout Analysis | API

To start a layout analysis process via the API, send a POST request to

https://transkribus.eu/TrpServer/rest/LA

The following query parameters are available on this endpoint:

  • collId: the ID of the collection containing the documents you want to process
  • doBlockSeg
    • true -> existing layout will be deleted
    • false (default) -> keep existing text block regions
  • doLineSeg
    • true (default) -> detect lines in text blocks
    • false -> keep existing lines
  • doPolygonToBaseline
    • true -> inspect line polygons and add baselines
    • false (default) -> keep existing baselines
  • doBaselineToPolygon
    • true -> extrapolate new line polygons from baselines
    • false (default) -> skip
  • jobImpl: the tool to use, default (omit this parameter) is “TranskribusLAJob” which is recommended for most documents
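
The query parameters above can be assembled into a request URL as in the following minimal Python sketch. The function name build_la_url and its keyword arguments are hypothetical helpers for illustration, and the sketch assumes that boolean flags are sent as the lowercase strings "true" and "false"; only the endpoint and the parameter names come from the list above.

```python
from urllib.parse import urlencode

# Endpoint as given in this article.
LA_ENDPOINT = "https://transkribus.eu/TrpServer/rest/LA"


def build_la_url(coll_id, do_block_seg=False, do_line_seg=True,
                 do_polygon_to_baseline=False, do_baseline_to_polygon=False,
                 job_impl=None):
    """Return the layout analysis URL with its query string.

    Hypothetical helper: the keyword defaults mirror the documented defaults.
    Assumption: booleans are serialized as lowercase "true"/"false".
    """
    params = {
        "collId": coll_id,
        "doBlockSeg": str(do_block_seg).lower(),
        "doLineSeg": str(do_line_seg).lower(),
        "doPolygonToBaseline": str(do_polygon_to_baseline).lower(),
        "doBaselineToPolygon": str(do_baseline_to_polygon).lower(),
    }
    # Omit jobImpl entirely to get the default tool ("TranskribusLAJob").
    if job_impl is not None:
        params["jobImpl"] = job_impl
    return f"{LA_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # 42 is a placeholder collection ID.
    print(build_la_url(42, do_block_seg=True))
```

Leaving job_impl unset keeps the parameter out of the URL, which matches the recommendation to omit it for most documents.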

The request body specifies the pages to be processed in terms of document IDs and page IDs. Optionally, a transcript ID (tsId) can select a specific transcription version, and PAGE XML region element IDs (regionIds) can be passed to process only specific sections of a page. The endpoint accepts JSON or XML:

{
   "docList" : {
      "docs" : [ {
         "docId" : 1543,
         "pageList" : {
            "pages" : [ {
               "pageId" : 1234,
               "regionIds" : [ "the_xml_id_of_a_text_region" ]
            }, {
               "pageId" : 12345,
               "tsId" : 1234567
            } ]
         }
      } ]
   }
}
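
A JSON body of this shape could be built programmatically, for instance as in the sketch below. The helpers build_job_parameters and build_la_request are hypothetical names, not part of any Transkribus client library. The request object is only constructed, not sent: authentication is not covered in this section, so any credentials or session handling would have to be added before calling urllib.request.urlopen.

```python
import json
import urllib.request


def build_job_parameters(docs):
    """Build the request body from {doc_id: [page_dict, ...]}.

    Hypothetical helper. Each page_dict needs "pageId" and may carry the
    optional "tsId" and "regionIds" keys shown in the example above.
    """
    return {
        "docList": {
            "docs": [
                {"docId": doc_id, "pageList": {"pages": list(pages)}}
                for doc_id, pages in docs.items()
            ]
        }
    }


def build_la_request(url, job_parameters):
    """Wrap the body in a POST request object without sending it."""
    body = json.dumps(job_parameters).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    params = build_job_parameters({
        1543: [
            {"pageId": 1234, "regionIds": ["the_xml_id_of_a_text_region"]},
            {"pageId": 12345, "tsId": 1234567},
        ]
    })
    print(json.dumps(params, indent=3))
```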

Equivalent XML representation:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobParameters>
    <docList>
        <docs>
            <docId>1543</docId>
            <pageList>
                <pages>
                    <pageId>1234</pageId>
                    <regionIds>the_xml_id_of_a_text_region</regionIds>
                </pages>
                <pages>
                    <pageId>12345</pageId>
                    <tsId>1234567</tsId>
                </pages>
            </pageList>
        </docs>
    </docList>
</jobParameters>
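
For clients that prefer XML, the same structure could be generated with Python's standard library, as sketched below. The function job_parameters_to_xml is a hypothetical helper, and the sketch assumes that several region IDs on one page are expressed as repeated regionIds elements; the example above only shows a single one.

```python
import xml.etree.ElementTree as ET


def job_parameters_to_xml(docs):
    """Serialize {doc_id: [page_dict, ...]} to a <jobParameters> document.

    Hypothetical helper. Assumption: multiple region IDs become repeated
    <regionIds> elements inside the same <pages> element.
    """
    root = ET.Element("jobParameters")
    doc_list = ET.SubElement(root, "docList")
    for doc_id, pages in docs.items():
        doc_el = ET.SubElement(doc_list, "docs")
        ET.SubElement(doc_el, "docId").text = str(doc_id)
        page_list = ET.SubElement(doc_el, "pageList")
        for page in pages:
            page_el = ET.SubElement(page_list, "pages")
            ET.SubElement(page_el, "pageId").text = str(page["pageId"])
            # tsId and regionIds are optional per page.
            if "tsId" in page:
                ET.SubElement(page_el, "tsId").text = str(page["tsId"])
            for region_id in page.get("regionIds", []):
                ET.SubElement(page_el, "regionIds").text = region_id
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(job_parameters_to_xml({
        1543: [
            {"pageId": 1234, "regionIds": ["the_xml_id_of_a_text_region"]},
            {"pageId": 12345, "tsId": 1234567},
        ]
    }))
```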

If successful (HTTP status code 200), the response will contain a job status object with a jobId that can be used to monitor the progress (see Job API).
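
Reading the job ID from the response might look like the following sketch. It assumes the response body is a JSON object that exposes the ID under the key "jobId"; the exact shape of the job status object is not specified in this section, and extract_job_id is a hypothetical helper.

```python
import json


def extract_job_id(status_code, body_text):
    """Return the jobId from a successful layout analysis response.

    Hypothetical helper. Assumption: the body is a JSON object with a
    top-level "jobId" key; adjust if the server returns another shape.
    """
    if status_code != 200:
        raise RuntimeError(f"Layout analysis request failed with HTTP {status_code}")
    status = json.loads(body_text)
    return status["jobId"]


if __name__ == "__main__":
    # Placeholder response body for illustration only.
    print(extract_job_id(200, '{"jobId": "98765"}'))
```

The returned ID is the value to pass on when monitoring progress through the Job API.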
