
[Enhancement]: Propagate ocr confidence to output hocr file

Jerome Flesch requested to merge Sqooba:enhancement/add-confidence-measure into master

Created by: a-pagano

This PR parses the per-word confidence values from Tesseract's output and writes them to the simplified output hOCR file, in the title attribute of the Box objects.

Example output: <span class="ocrx_word" title="bbox 638 1797 751 1823; x_wconf 70">Word</span>
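For readers who want to consume the new attribute, here is a minimal sketch of reading and writing a title string in the format shown above. The helper names `parse_title` and `format_title` are hypothetical and are not part of this PR or the library's API; the sketch also assumes `x_wconf` is an integer, as in the example output.

```python
import re

# Hypothetical helpers for illustration only; not the code from this PR.
_BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")
_WCONF_RE = re.compile(r"x_wconf (\d+)")  # assumes an integer confidence


def parse_title(title):
    """Return ((x1, y1, x2, y2), confidence) from an hOCR title attribute.

    Either element is None if the matching property is absent.
    """
    bbox = None
    conf = None
    # hOCR properties in a title attribute are separated by semicolons.
    for prop in title.split(";"):
        prop = prop.strip()
        m = _BBOX_RE.fullmatch(prop)
        if m:
            bbox = tuple(int(g) for g in m.groups())
            continue
        m = _WCONF_RE.fullmatch(prop)
        if m:
            conf = int(m.group(1))
    return bbox, conf


def format_title(bbox, conf=None):
    """Build a title attribute, appending x_wconf only when it is known."""
    title = "bbox %d %d %d %d" % bbox
    if conf is not None:
        title += "; x_wconf %d" % conf
    return title
```

Omitting `x_wconf` when no confidence is available keeps the output identical to the previous format for boxes without one.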

Note: this relates directly to #74 (closed) and #58, and less directly to #12.
