
Support for Deskew #174

Open
mantielero opened this issue Jan 6, 2019 · 3 comments

Comments

@mantielero

It would be useful to be able to retrieve the deskew angle that tesseract reports for a page when run with --psm 2.

bozhodimitrov changed the title from "[Feature Request] Deskew" to "Support for Deskew" on Jan 6, 2019
@bozhodimitrov
Collaborator

bozhodimitrov commented Jan 6, 2019

Hi @mantielero, can you post an example tesseract command (and, if possible, a sample image)?
It will be needed for reference and testing.

@mantielero
Author

It should work with any sample image. The example would be:

$ tesseract image.png - --psm 2
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 641
Orientation: 0
WritingDirection: 0
TextlineOrder: 2
Deskew angle: 0.0015

The value to gather would be Deskew angle.
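
To illustrate the request, here is a minimal sketch of pulling that value out of the text shown above. The helper name `parse_deskew_angle` is hypothetical and not part of pytesseract; the sketch assumes the `--psm 2` report has already been captured as a string (from whichever stream tesseract writes it to) and that the line always has the form `Deskew angle: <number>`.

```python
import re

# Hypothetical helper, not an existing pytesseract function.
# Assumes the report line looks like "Deskew angle: 0.0015".
_DESKEW_RE = re.compile(r"^Deskew angle:\s*(-?\d+(?:\.\d+)?)\s*$", re.MULTILINE)


def parse_deskew_angle(output):
    """Return the deskew angle as a float, or None if the line is absent."""
    match = _DESKEW_RE.search(output)
    return float(match.group(1)) if match else None


# Sample text copied from the tesseract run quoted in this issue.
sample = """Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 641
Orientation: 0
WritingDirection: 0
TextlineOrder: 2
Deskew angle: 0.0015
"""

angle = parse_deskew_angle(sample)
```

Returning None when the line is missing lets a caller distinguish "no deskew information reported" from an angle of zero.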

@nok
Contributor

nok commented Jul 27, 2019

So far we have always run tesseract with page segmentation mode 0 (-psm 0, see pytesseract.py#L428). Running it with -psm 2 instead would return the new values, which we could then transform into the requested format.

The differences:

root@ee39b77b1b6a:~# tesseract ./test.png - -l eng -psm 2
Orientation: 0
WritingDirection: 0
TextlineOrder: 2
Deskew angle: -0.0038
root@ee39b77b1b6a:~# tesseract ./test.png - -l eng -psm 0
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 15.38
Script: Latin
Script confidence: 466.67
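
Both reports are plain "Key: value" lines, so one possible way to transform them into a common format is sketched below. This is only an assumption about how it could be done, not how pytesseract does it: `parse_report` is a hypothetical name, and the sample strings are the two outputs quoted above with the shell prompt lines left out.

```python
def parse_report(text):
    """Turn "Key: value" lines into a dict; numeric values become floats."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        value = value.strip()
        try:
            result[key.strip()] = float(value)
        except ValueError:
            result[key.strip()] = value  # keep non-numeric values as text
    return result


# Outputs copied from the two runs above (prompt lines removed).
PSM2_OUTPUT = """Orientation: 0
WritingDirection: 0
TextlineOrder: 2
Deskew angle: -0.0038
"""

PSM0_OUTPUT = """Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 15.38
Script: Latin
Script confidence: 466.67
"""

psm2 = parse_report(PSM2_OUTPUT)
psm0 = parse_report(PSM0_OUTPUT)
```

Under this sketch the deskew angle is available as `psm2["Deskew angle"]`, while the mode 0 dict has no such key, which matches the difference shown in the two outputs.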


3 participants