Commit

Merge pull request #503 from madmaze/fix_default_hocr_config
Fix default hocr config
int3l committed Aug 25, 2023
2 parents 1fd3e60 + b560563 commit e0b8104
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions pytesseract/pytesseract.py
@@ -445,6 +445,10 @@ def image_to_pdf_or_hocr(

    if extension not in {'pdf', 'hocr'}:
        raise ValueError(f'Unsupported extension: {extension}')

    if extension == 'hocr':
        config = f'-c tessedit_create_hocr=1 {config.strip()}'

    args = [image, extension, lang, config, nice, timeout, True]

    return run_and_get_output(*args)
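
For readers who want to see the effect of the added lines in isolation, here is a minimal sketch of the config handling. The helper name build_config is hypothetical and is not part of pytesseract; it only mirrors the validation and the string prefixing shown in the hunk above, without invoking Tesseract.

```python
def build_config(extension: str, config: str = '') -> str:
    """Hypothetical helper mirroring the config handling in this diff."""
    if extension not in {'pdf', 'hocr'}:
        raise ValueError(f'Unsupported extension: {extension}')

    if extension == 'hocr':
        # Prepend the hOCR switch; the caller's options, stripped of
        # surrounding whitespace, follow after a single space.
        config = f'-c tessedit_create_hocr=1 {config.strip()}'

    return config


# hOCR output: the switch is prepended to the caller's options.
print(repr(build_config('hocr', '  --psm 6 ')))

# PDF output: the config string is passed through unchanged.
print(repr(build_config('pdf', '--psm 6')))
```

Note that with an empty config the f-string leaves a trailing space after the switch. That appears harmless if the config string is later split on whitespace, but this sketch does not verify it.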
