Allow tesseract 4.0.0alpha to be used with pyocr
Created by: ddddavidmartin
The current tesseract 4.0 version is still in alpha and returns the version string "tesseract 4.00.00alpha". This breaks the existing get_version function as it expects integer values only.
To work around this, the pull request takes only the leading digits of the version and returns them.
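As a rough illustration of the idea, here is a minimal sketch of parsing only the leading digits. It is not the actual pyocr code: the function name parse_version is hypothetical, and it assumes the version has already been extracted from the tesseract output as a dotted string, with the leading digits taken from each dot-separated component.

```python
import re


def parse_version(version_string):
    """Parse a dotted version string into a tuple of ints.

    Hypothetical helper, not part of pyocr. Only the leading digits of
    each dot-separated component are kept, so a suffix such as "alpha"
    is ignored.
    """
    parts = []
    for component in version_string.strip().split("."):
        match = re.match(r"\d+", component)
        if match is None:
            # No leading digits at all: still treat this as an error.
            raise ValueError(
                "Unable to parse version component: %r" % component
            )
        parts.append(int(match.group(0)))
    return tuple(parts)


print(parse_version("4.00.00alpha"))  # (4, 0, 0)
print(parse_version("3.04.01"))       # (3, 4, 1)
```

A component with no leading digits still raises an error in this sketch, so a genuinely malformed version string is not silently accepted.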
Note: I haven't thoroughly tested how pyocr fares with tesseract 4, but I am using it with paperless and it has been working fine for me so far.
How to test this:
- build and install the current tesseract 4.0.0alpha
- start consumption with paperless for example
- the current pyocr version fails with:
  pyocr.error.TesseractError: (0, 'Unable to parse Tesseract version (not a number): [4.00.00alpha]')