Releases: UB-Mannheim/ocr-fileformat

v0.6.0

23 Oct 10:35

What's Changed

  • Add CodeQL workflow for GitHub code scanning by @lgtm-com in #155
  • gcv__page: use -source-json instead of -source-xml by @bertsky in #156
  • make install: use newline in sed c cmd by @bertsky in #158
  • Add textract2page by @bertsky in #160
  • ensure venv for Python tools by @bertsky in #162
  • add PRImA converter for GCV→ALTO by @bertsky in #163
  • Update Makefile to support macOS by @stweil in #165
  • update textract2page, hOCR-to-ALTO and alto-schema by @kba in #166
  • Fix two issues reported by CodeQL CI by @stweil in #161
  • Fix broken conversions from hOCR to ALTO by @stweil in #167
  • Replace broken Travis CI by GitHub action by @stweil in #168
  • Use first bash from PATH (allows running on macOS) by @stweil in #169

New Contributors

  • @lgtm-com made their first contribution in #155

Full Changelog: v0.5.0...v0.6.0

v0.5.0

08 Nov 08:52

Full Changelog: v0.4.0...v0.5.0

v0.4.0

18 Sep 11:04

Update JPageConverter and saxon9he, drop support for Python 2

v0.3.2

09 Jul 13:29
  • Fix error handling for missing wget, unzip or git

v0.3.1

25 Jun 12:42
  • Improve error handling for missing wget, unzip or git

v0.3.0

09 Jan 12:47
  • Improve PAGE support
  • Update ALTO support
  • Add new conversions, e.g. hOCR to TEI, ABBYY to hOCR, PAGE to ALTO, ABBYY / ALTO / GCV / hOCR to PAGE, GCV to hOCR
  • Add new command line option --version
  • Fix bugs

v0.2.3

11 Dec 13:19
@kba

Fixed

  • Fix download button in web interface #73
  • Fix https URL in Docker builds #75

Changed

  • Tab bar above input #72
  • Example URLs via https

Added

  • make help

Add transformation gcv2hocr and fix some issues with the web interface

10 Dec 21:01
@kba
8fa510e
  • Support new transformation from Google Cloud Vision format to hOCR
  • Fix format switching in transform web interface
  • Produce valid HTML
  • Use eslint for JS code style checking
  • Use best practices for Dockerfile

Update to new URLs for ABBYY schema and Docker fixes

27 Feb 12:16
  • Docker fixes (busybox/alpine incompatibilities + allow overriding web config) and add documentation for Docker #33, #45, #53
  • Update URLs to ABBYY schemas, add new PAGE format 2016-07-15 fded289
  • Switch to official filak/hOCR-to-ALTO repo, linking language codes lookup xml #48, #46, #52

Improved web interface, code cleanup and script support

13 Sep 13:57
  • Add option to run arbitrary scripts: in addition to XSD/XSLT, arbitrary executable scripts (written in
    Python, bash or compiled C code) can be placed in ./script/validate and ./script/transform/.
  • Validation: check hOCR with hocr-check from tmbdev/hocr-tools
  • Web interface: Download button for transformation results
  • Web interface: Support file uploads for transformation and validation
  • Enable ALTO/hOCR to plain text transformations
  • Code cleanup of the shared shell script library
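
As an illustration of the script option in the first item above, here is a minimal sketch of what an executable transform script could look like. It is not taken from the project: the calling convention (`script INPUT [OUTPUT]`), the function names, and the choice of an ALTO to plain text conversion are all assumptions made for this example, so check the project documentation for the interface scripts are actually called with.

```python
#!/usr/bin/env python3
"""Hypothetical transform script: ALTO XML to plain text (illustration only)."""
import sys
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a namespace prefix, e.g. '{http://...}String' -> 'String'."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(root):
    """Join the CONTENT of each String in a TextLine; one output line per TextLine."""
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in elem
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return "\n".join(lines)


def main(argv):
    # ASSUMPTION: the script is invoked as `script INPUT [OUTPUT]`;
    # the real calling convention may differ.
    root = ET.parse(argv[1]).getroot()
    text = alto_to_text(root) + "\n"
    if len(argv) > 2:
        with open(argv[2], "w", encoding="utf-8") as out:
            out.write(text)
    else:
        sys.stdout.write(text)
    return 0


# Only run when called directly with at least an input file.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Matching on local element names keeps the sketch independent of the ALTO schema version, since each version uses a different namespace URI. A script like this would need to be marked executable before being placed in the transform directory.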

More details: v0.1.0...v0.2.0