Commit 221cfdbe authored by Jerome Flesch

README: Document orientation detection

Signed-off-by: Jerome Flesch <jflesch@gmail.com>
parent e8a47944
@@ -31,6 +31,9 @@ bmp, tiff, and others. It also support bounding box data.
* hOCR: Only a subset of the specification is supported. For instance, pages and paragraph positions are not stored.
## Usage
### Initialization
```Python
from PIL import Image
import sys
@@ -52,17 +55,26 @@ print("Available languages: %s" % ", ".join(langs))
lang = langs[0]
print("Will use lang '%s'" % (lang))
# Ex: Will use lang 'fra'
# Note that languages are NOT sorted in any way. Please refer
# to the system locale settings for the default language
# to use.
```
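Since the language list is unordered, `langs[0]` is effectively arbitrary. The following is a minimal sketch of one way to pick a language by explicit preference instead. `pick_language` is a hypothetical helper, not part of PyOCR, and the sample list only stands in for the `langs` value obtained above.

```Python
def pick_language(available, preferred):
    """Return the first entry of `preferred` that is in `available`.

    Falls back to the first available language, or None if there are none.
    """
    for code in preferred:
        if code in available:
            return code
    return available[0] if available else None


# Stand-in for the `langs` list; real contents vary by system.
langs = ["fra", "eng", "deu"]
lang = pick_language(langs, preferred=["eng", "fra"])
print("Will use lang '%s'" % lang)
```

The preference list could be derived from the system locale, but mapping locale names to the language codes a given tool expects is tool-specific and is left out here.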
### Image to text
```Python
txt = tool.image_to_string(
    Image.open('test.png'),
    lang=lang,
    builder=pyocr.builders.TextBuilder()
)
word_boxes = tool.image_to_string(
    Image.open('test.png'),
    lang="eng",
    builder=pyocr.builders.WordBoxBuilder()
)
line_and_word_boxes = tool.image_to_string(
    Image.open('test.png'), lang="fra",
    builder=pyocr.builders.LineBoxBuilder()
@@ -74,9 +86,42 @@ digits = tool.image_to_string(
    lang=lang,
    builder=pyocr.tesseract.DigitBuilder()
)
```
Argument 'lang' is optional. The default value depends on
the tool used.
Argument 'builder' is optional. The default value is
builders.TextBuilder().
### Orientation detection
Currently only available with Tesseract or Libtesseract.
```Python
if tool.can_detect_orientation():
    orientation = tool.detect_orientation(
        Image.open('test.png'),
        lang='fra'
    )
    pprint("Orientation: {}".format(orientation))
    # Ex: Orientation: {
    #   'angle': 90,
    #   'confidence': 123.4,
    # }
```
Angles are given in degrees, in the range [0, 360). The exact
possible values depend on the tool used. Tesseract only returns
0, 90, 180, or 270.
Confidence is a score arbitrarily defined by the tool. It MAY be
missing from the result.
detect_orientation() MAY raise an exception if no text is
detected in the image.
## Dependencies
......