Releases: OCR-D/ocrd_all
v2024-03-08
Removed:
- tesserocr and tesseract (which are now submodules of ocrd_tesserocr)
core f54b002..c5b5580
Release: v2.63.3
- 📦 v2.63.3
- 📝 changelog
- OcrdMets.add_file: fix finding existing el_pagediv
- 📝 changelog
- expose uninstall-workaround
- 📦 v2.63.2
- 📦 v2.63.1
- Merge branch 'fix-get-physical-pages'
- make coverage: omit generateDS code
cor-asv-ann 0a4f684..f7ebb74
- compare/evaluate: ensure lines are shown verbatim on single file level
- compare/evaluate: do not record worst lines verbatim
- compare/evaluate: also record 1% worst lines
ocrd_keraslm 759805b..472197f
Release: v0.4.2
- update assets
- 📦 v0.4.2
- generators: do not require input length > window size
- train: allow passing directory for training data, too
- add checkpointing, allow continuing from ckpt
- suppress TF gibberish
ocrd_tesserocr 08a020f..ed73d96
Release: v0.18.1
- 📦 v0.18.1
- 📝 changelog
- update tesseract/tesserocr to most recent
- update/improve readme
- simplify dockerfile
- CI: add make test
- add repo/assets as proper submodule, rename -clean → clean-
- make Tesseract build configurable
- also install minimal needed models
- explify dependencies install ← install-tesserocr ← install-tesseract
- test: download files of data assets
- test: also set up logging system during tests
workflow-configuration 8418b3f..bd149f8
Release: 0.1.3
- unprefix regex paths by directory argument, if any
- forgot to convert exit to continue in 6fe3c6b3
- no need to close logger FDs on exit
- ocrd-import: use coproc instead of handcrafted FIFOs for loggers
- ocrd-import: rewrite (no parallel jobs, but parallel logging)…
- ocrd-import: add option --basename, default to using directory as well
- ocrd-import: simplify+speedup…
v2024-02-01
Added:
Removed:
- https://github.com/qurator-spk/page2tsv/ (temporarily)
cor-asv-fst 076e04e..4211371
- Merge pull request #4 from stweil/master
core ac1f15b..b94b185
Release: v2.62.0
- reenable circle because docker failed to build on ghcr.io
- 📦 v2.62.0
- {pypi,build}-workaround: missed a commit, s/get_distribution(...).version/dist_version(...)
- Merge branch 'circle-to-gha'
- 📝 changelog
- Merge branch 'master' into ocrd-tool-json-root
- expose ocrd-tool.json for ocrd-dummy in root like processors
- 📦 v2.60.3
- 📝 changelog
- Fix --editable install for setuptools>=64, setuptools#3548
- 📦 v2.60.2
- 📝 changelog
- Merge pull request #1161 from OCR-D/logging-downgrade-level
- Merge pull request #1160 from OCR-D/is-oai-content-loglevel-debug
dinglehopper f077ce2..f8e3108
Release: v0.9.4
- 🚧 GitLab CI Test: Push after pulling
- 🚧 GitLab CI Test: Trigger only on default branch (and do not hardcode it)
- 🚧 GitLab CI Test
- 🔍 ruff: Remove ignore configuration, we use multimethods in a compatible way now
- ⚙ pre-commit: Update hooks
- 🚧 GitLab CI Test
- 🔍 mypy: Use an almost strict mypy configuration, and fix any issues
- 🔍 mypy: Use a compatible syntax for multimethod
- 🔍 mypy: Remove ExtractedText.segments converter
- 🔍 mypy: Avoid using check() for all attr validators
- 🔍 mypy: Make cli.process() typed so mypy checks it (and issues no warning)
- Merge branch 'pr103'
- ⚙ Update ruff+mypy dependencies
- ⚙ pre-commit: Update hooks
- ⬆ Move on to supporting Python >= 3.8 only
- 🐛 Use typing.List instead of list, for Python <3.9
- 🐛 Use Optional instead of | none, for Python <3.10
- ⚙ pre-commit: Update hooks
- 🐛 Fix generating word differences
- ⚙ pre-commit: Update hooks
- Merge branch 'master' of https://github.com/qurator-spk/dinglehopper
- Merge branch 'master' into performance
- ⬆ Update uniseg dependency
- ❎ Make joining grapheme clusters more robust by checking joiner and handling an empty joiner
- 🐛 Fix score_hint call in cli_line_dirs
- 🐛 Fix docstring of distance() for grapheme clusters
- 🐛 Fix calculation of score_hint for edge cases, e.g. when CER is infinite
- 🕸 Do not use deprecated ID, pageId options
- ✔ Add mets:FLocat's @LOCTYPE/OTHERLOCTYPE to test data
- ⬆ Update multimethod dependency
- 🐛 Update tests for ExtractedText
- use uniseg again
- update rapidfuzz version
- replace uniseg with uniseg2
- apply black
- move grapheme clusters to ExtractedText
- remove python2.7 futures
- remove unused includes
- only call words_normalized once
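The two typing fixes in the dinglehopper list above ("Use typing.List instead of list, for Python <3.9" and "Use Optional instead of | none, for Python <3.10") can be illustrated with a minimal sketch. The function name `longest_token` is hypothetical and not taken from dinglehopper; it only shows annotations that still evaluate on Python 3.8.

```python
from typing import List, Optional


def longest_token(tokens: List[str]) -> Optional[str]:
    """Return the longest token, or None for an empty list (hypothetical helper)."""
    # typing.List[str] works on Python 3.8; the built-in list[str] needs 3.9+.
    # Optional[str] works on Python 3.8; the str | None syntax needs 3.10+.
    if not tokens:
        return None
    return max(tokens, key=len)
```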
eynollah 706433c..032a99e
Release: v0.2.0
- adapt to OcrdFile.local_filename now :Path
- adapt to ocrd>=2.54 url vs local_filename
ocrd_fileformat c5f0c52..ba79de9
Release: v0.10.0
- 📦 v0.10.0
- 📝 changelog
- Update ocr-fileformat to include UB-Mannheim/ocr-fileformat#172
- Merge branch 'fix-textract2page'
- update ocr-fileformat to latest
- Update ocr-fileformat to v0.6.0
ocrd_repair_inconsistencies cf879c1..94c482f
- 🕸 README: Mention archival of the project
opencv-python 7cfd1ee..8ad8ec1
Release: 80
- Merge branch 'as/4.9.0-readme-update' into 4.x
- Merge pull request #941 from asmorkalov/as/mac_m1_venv_for_test
- Merge pull request #940 from asmorkalov:as/donation
- Merge pull request #938 from asmorkalov"as/4.9.0-pre
- Merge pull request #934 from asmorkalov/as/native_mac_m1_runner
- Merge pull request #936 from asmorkalov:as/mac_intel_update
- Merge pull request #932 from asmorkalov/as/pre-4.9.0_linux_upgrade
- Merge pull request #931 from asmorkalov:as/ipp_icv_license
- Merge pull request #904 from asmorkalov:as/python_3.12
- Merge pull request #927 from dkurt:try_enable_dependents
sbb_binarization f3c6ac8..b89ec49
Release: v0.1.0
- Merge pull request #65 from rettinghaus/update-tests
tesseract ea0b245..8ee020e
Release: 5.3.4
- Create new release 5.3.4
- Set User-Agent: header field in HTTP request for curl downloads
- Merge pull request #4178 from sadra-barikbin/patch-1
- Merge pull request #4174 from stweil/warnings
workflow-configuration cbc3234..f54c91a
Release: 0.1.3
- ocrd-page-transform: local_filename instead of url
- ocrd-page-transform: fix unbound variable
v2023-12-15
core 742906e..ac1f15b
Release: v2.60.1
- 📦 v2.60.1
- docker: we need .git during build for setuptools_scm
- 📝 changelog
- Merge branch 'git-versioning'
- 📝 changelog
- defaults for mets_basename and mets_server_url
- Merge branch 'master' of https://github.com/OCR-D/core
- 📝 changelog
- ocrd workspace list-page: ignore files without pageId, fix #1148
v2023-12-07
core d8c5813..742906e
Release: v2.59.1
- 📦 v2.59.1
- 📝 changelog
- Merge pull request #1142 from OCR-D/fix-ws-locks
- 📝 changelog
- fix ranges, use numpy
- 📦 v2.59.0
- 📝 changelog
- Merge branch 'network-workflow-api'
- 📝 changelog
- Merge pull request #1139 from OCR-D/bagger-filegrp-filter
- 📝 changelog
- Merge branch 'list-page-extended'
- 📝 changelog
- Merge pull request #1136 from OCR-D/network-processor-api
- 📝 changelog
- bagger: fallback to ID-derived filename to avoid filename conflicts
- test bagger: sample workflow with conflicting basenames
- bagger test: rewrite as pytest
- 📝 changelog
- Merge pull request #1134 from OCR-D/page-update
- 📝 changelog
- Merge branch 'update-apidocs'
- OcrdMets: remove dead code exit method, #1130
dinglehopper dbaccdd..f077ce2
Release: v0.9.4
- 🐛 dinglehopper-summarize: Handle reports without difference stats
- Merge pull request #97 from qurator-spk/clean-remove-six-dep-again
- Merge pull request #96 from qurator-spk/test-on-pr-but-really
- Merge pull request #94 from qurator-spk/test-on-pr
- Merge pull request #93 from qurator-spk/update-dep-multimethod
- Merge pull request #92 from qurator-spk/update-pre-commit
- Merge pull request #91 from qurator-spk/test-remove-circleci
- Merge pull request #90 from qurator-spk/test-on-python-3.12
- ✔ Add mets:FLocat's @LOCTYPE/OTHERLOCTYPE to test data
format-converters 9615db1..fa8b4b5
- Merge pull request #24 from stweil/orientation
- Merge pull request #18 from stweil/rotate
- Merge pull request #23 from stweil/TextEquivNone
- Merge pull request #22 from stweil/image
- Merge pull request #19 from stweil/page_version
- Merge pull request #21 from stweil/image_dimensions
- Merge pull request #20 from stweil/close_statement
- Merge pull request #17 from stweil/update
ocrd_calamari c0a4dfd..ad8febd
Release: v1.0.6
- ⚙ pre-commit: update config (using pre-commit-update hook)
- 🎨 Remove extra whitespace in CircleCI confg
- ✔ CircleCI: Update codecov orb
- Merge pull request #102 from OCR-D/review-test-base
- Merge pull request #106 from OCR-D/setup-update-description
- ✔ CircleCI: Try to fix caching of pip cache
- Merge pull request #107 from OCR-D/circleci-pip-cache-1
- Merge pull request #101 from OCR-D/use-ruff-config-like-dinglehoppers
- Merge pull request #100 from OCR-D/fix/set-default-MODEL-for-bare-pytest
- 🐛 Fix importing + exporting test.base.assets
- 🐛 pre-commit/mypy: Install missing types-setuptools depedency
- 🧹 Remove unused exports (done by ruff --fix .)
- 🛠 Add pre-commit configuration
- 🐛 Require ocrd >= 2.54.0 for tf_disable_interactive_logs
- ✒ README: Minor changes
- ✒ README: Fix and simplify example instructions
- 🎨 Reformat using Black + ruff
- 🧹 Remove protobuf restriction (→ calamari-ocr 1.0.6 has it now)
- Merge pull request #78 from mikegerber/test-python-3.11
- 💩 Add a script fix-calamari1-model to fix regexen in 1.0 models
- ✒ README: Wrap long line
- 🧹 Remove workaround for fixed np.str problem in Calamari
ocrd_neat 0f64f07..2f7d01c
Release: v0.0.1
ocrd_pagetopdf 7368f51..24f77d3
Release: v1.1.0
- Use first bash from PATH (allows running on macOS)
- Merge pull request #23 from kba/mets-server-url-support
tesseract dc228ed..ea0b245
Release: 5.3.3
- Update status badge for GitHub workflow sw (add missing line break)
- Update status badge for GitHub workflow sw
- Correct indefinite articles before vowels
- Update issue-bug.yml
- Merge pull request #4162 from tfmorris/3000-tcp-scrollview
- Avoid conversions from std::string to char* to std::string
- Remove unnecessary conversions from std::string to C string
- Remove whitespace at line endings
- Update sw.yml
- Update README.md
- Update sw.yml
- Fail on curl download errors
- Add new parameter curl_cookiefile for curl_easy_setopt
- ci: Fix clang build for Ubuntu 22.04
- ci: Allow manual trigger of unittest
- Move bail_out function before libtoolize check
- Merge pull request #4150 from tfmorris/4149-directory-to-stdout
workflow-configuration d0d208c..cbc3234
- add page-ensure-textequiv-conf.xsl
- add page-rename-id-clashes.xsl
- PAGE CLIs: allow combining --pretty and --diff
- new pair of XSLTs: un/flatten table cells
- require ocrd >= 2.58.1
- require ocrd >= 2.58
- ocrd-page-transform: support --mets-server-url
v2023-10-20
core caa7ac3..d8c5813
Release: v2.58.1
- 📦 v2.58.1
- 📝 changelog
- bashlib: set empty string as default value for ocrd__argv[mets_server_url], OCR-D/olena#95
- 📦 v2.58.0
- 📝 changelog
- Merge pull request #1127 from OCR-D/fix-bashlib-log3
- 📝 changelog
- ocrd workspace bulk-add now supports --mets-server-url
- bashlib: short option -U for --mets-server-url
- 📝 changelog
- Merge remote-tracking branch 'origin/fix-bashlib-log2'
- 📝 changelog
- Merge branch 'mets-server-reload'
- 📝 changelog
- Merge pull request #1121 from OCR-D/fix-bashlib-log
ocrd_fileformat cfb24ec..7364265
Release: v0.9.1
- 📦 v0.9.1
- 📝 changelog
- require ocrd >= 2.58.1
- 📦 v0.9.0
- 📝 changelog
- require ocrd >= 2.58
- Support --mets-server-url; get local_filename not url
ocrd_im6convert 105697f..db18917
Release: v0.1.1
- 📦 v0.1.1
- 📝 changelog
- require ocrd >= 2.58.1
- 📦 v0.1.0
- 📝 changelog
- minversion must be MAJOR.MINOR.PATCH
- require ocrd >= 2.58
- add --mets-server-url support
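The entry "minversion must be MAJOR.MINOR.PATCH" above explains why "require ocrd >= 2.58" was followed by a three-part version. As a rough sketch of such a format check (the helper `is_major_minor_patch` is hypothetical and is not the validation code used by OCR-D):

```python
import re

# Three dot-separated non-negative integers, e.g. "2.58.0".
# This pattern is an assumption made for illustration only.
_FULL_VERSION = re.compile(r"\d+\.\d+\.\d+")


def is_major_minor_patch(version: str) -> bool:
    """Return True if the string has exactly the form MAJOR.MINOR.PATCH."""
    return _FULL_VERSION.fullmatch(version) is not None
```

Under this assumption, "2.58" would be rejected while "2.58.0" and "2.58.1" would pass.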
ocrd_olena a2e2520..c1f7cab
Release: v1.5.0
- 📦 v1.5.0
- 📝 changelog
- require ocrd >= 2.58.1
- debug
- minversion must be MAJOR.MINOR.PATCH
- require ocrd >= 2.58
- support --mets-server-url
ocrd_pagetopdf 4f4a330..7368f51
Release: v1.0.0
- require ocrd >= 2.58.1
- require ocrd >= 2.58
- Use local_filename, not url
- restore $zeros assignment
- Support --mets-server-url
workflow-configuration f574f82..d0d208c
Release: 0.1.3
- require ocrd >= 2.58
- ocrd-page-transform: support --mets-server-url
v2023-10-17
cor-asv-ann 006a70e..4216c16
Release: v0.1.14
- adapt to Numpy deprecations
- CI: use ocrd/core-cuda as base image
- CI: dummy venv
- CI: use proper tab character
- CI: clone first
- CI: mkdir first
- CI: chdir to tmp location
- CI: use /tmp for aux clone of ocrd_all
- try getting tensorflow-gpu from Nvidia
- use proper URLs for submodules
- Merge pull request #6 from kba/init-report-dict
- evaluate: skip pages with no results
core de08453..3b0307a
Release: v2.56.0
- 📦 v2.56.0
- 📝 changelog
- Merge branch 'network-log-refactoring'
- 📦 v2.55.2
- 📝 changelog
- 🐛 pydantic fields must not start with underscore
- 📦 v2.55.1
- 📝 changelog
- Merge pull request #1113 from OCR-D/bulk-add-url-local-filename
- 📦 v2.55.0
- 📝 changelog
- Merge branch 'workflow-endpoint'
- 📝 changelog
- generate_page_range: verify single-page range based on start
- generate_page_range: warn, not raise, if start==end, fix #1106
- 📝 changelog
- logging: remove custom logging in ocrd_network, use explicit logger name
- logging remove hard-coded setLevel in decorators/ocrd_network
- helpers.run_processor: setOverrideLogLevel if log_level is provided
- ocrd log: default to ocrd.log_cli logger name
- METS Server: add basic logging of operations
- logging: use ocrd.{utils,models,exif} not ocrd_{utils,models,exif}
- ocrd_utils.logging.getLogger: no more initLogging
- 📝 changelog
- Merge branch 'master' into mets-server-fixes-2023-09-15
- mets server: remove socket file on shutdown
- mets server: do the chmod before server start, not before connection
- 📝 changelog
- METS server: make socket world-readable/-writable
- 📦 v2.54.0
- Merge pull request #1095 from OCR-D/run-cli-mets-server-url
- 📝 changelog
- Merge branch 'revise-logging'
- Merge pull request #1093 from OCR-D/create-default-queue
- Merge pull request #1080 from OCR-D/revise-logging
- bashlib: fix --help output
- 📝 changelog
- Merge branch 'keep-remote-links'
- 📝 changelog
- downgrade ValueError to log.warning about inconsistent pageId for processor calls
- Merge branch 'master' into warn-empty-page
- raise ValueError if --page-id is provided but leads to empty result
- 📝 changelog
- Merge pull request #1069 from OCR-D/processing_server_ext_1046
- Merge branch 'mets-server'
- 📝 changelog
- ci: localhost -> 127.0.0.1
- pin requests < 2.30, OCR-D/core#1082
- mets server: forbid local/remote workspace with different directories
- mets server: allow both local_filename and url to be None
- Merge branch 'master' into mets-server
- mets server: test both UDS and TCP variant
- ClientSideOcrdFile et al need url too
- pass mets_server_url from run_processor
- typo: -{,-}mets-server-url
- move ClientSideOcrd{Agent,File} to ocrd_models
- METS server: support -U for processor options
- workspace server start: pass workspace context
- mets server: single option --mets-server-url/-U
- mets server will never pass content to workspace.add_file
- mets server: no content will pass through it
- mets server: clean up is_remote muddle
- mets server: support unique_identifier
- mets server: str handlers
- mets server: provide fallback for non-wrapped OcrdFile methods
- mets server: remove XXX HACK comments, they are not;
- mets server: improve docs
- mets server: add stop
- Update ocrd/ocrd/cli/workspace.py
- ocrd workspace CLI: reference METS server option
- METS server: consistently use local_filename
- Update ocrd/ocrd/cli/workspace.py
- METS Server: equivalent functionality to files for agents
- finish implementation / test mets server
- Merge remote-tracking branch 'origin/master' into mets-server
- workspace: save content to file only if not remote
- Merge branch 'mets-server' of https://github.com/kba/ocrd-core into mets-server
- mets_server: file search/adding on /file not /
- mets_server: missed mimetype kwarg
- mets_server: different loggers for socket/host-port
- Merge branch 'mets-server' of https://github.com/kba/ocrd-core into mets-server
- mets_server: replace Model constructor with static create calls
- --port must be int
- resolver: shorten mets_server_{host,port} check
- mets_server: only save_mets on PUT and DELETE
- OcrdWorkspace.is_remote should be a bool
- ClientSideOcrdMets: fix signature of self.file_groups
- mets-server: bashlib should take same args
- remove noise from makefile
- slowly but determinedly
- getting there
- .
- wip
dinglehopper 0fd4ea1..dbaccdd
Release: v0.9.4
- ✒ README: Minor whitespace cleanup
- ✒ README: Recommend installing via pip and from PyPI
- 📦 v0.9.4
- 🎨 editorconfig: *.json should have a final newline
- 🧹 pyproject: Remove extra *.json
- 🧹 Remove empty setup.cfg
- 📦 v0.9.3
- 🐛 Remove MANIFEST.in workaround, now that setuptools_ocrd is fixed
- 📦 v0.9.2
- 🧹 .gitignore dist/
- 🐛 Workaround sdist not containing top-level ocrd-tool.json
- ⚙ GitHub Actions: Call test workflow when (before) deploying
- 🎨 Release: Make installing setuptools-ocrd conditional on ocrd-tool.json
- 🐛 Release: Try fixing getting the version (install setuptools-ocrd)
- 📦 v0.9.1
- ✒ README: Update badges
- Revert "🚧 GitHub Actions: Try testing on Python 3.12"
- 🚧 GitHub Actions: Try testing on Python 3.12
- 🚧 GitHub Actions/CircleCI: Remove testing from CircleCI config
- 🚧 GitHub Actions: Do no try installing ruff on Python 3.6
- 🚧 GitHub Actions: Do no try installing pytest-ruff on Python 3.6
- 🚧 GitHub Actions: Avoid compiling OpenCV and NumPy on Python 3.6
- 🚧 GitHub Actions: Fix testing for Python 3.6
- 🚧 GitHub Actions: Disable matrix fail-fast
- 🚧 GitHub Actions: Test on multiple Python versions
- 🚧 GitHub Actions: Test report
- 🚧 GitHub Actions: Try shell for loop to install from all requirements*.txt
- 🚧 GitHub Actions: Rework test, run in src/
- 🚧 GitHub Actions: Allow running test manually
- 🚧 GitHub Actions: Rename test workflow, also run on schedule
- 🚧 GitHub Actions: Add test worklow
- 🚧 GitHub Actions: Add release workflow
- 🧹 Make dinglehopper.* exports explicit
- ⚙ ruff: Ignore F811 (no redefinitions) for now, as ruff considers the multimethods redefinitions
- 🎨 Reformat comments + strings manually (not auto-fixed by Black)
- ⬆ Use f-strings
- 🎨 Reformat using Black
- 🎨 Sort imports (auto-fixed by ruff)
- ⚙ Add pre-commit
- 🛠 Replace flake8 + pylint with ruff
- ⚙ Move mypy settings to pyproject.toml
- ⚙ pytest.ini → pyproject.toml
- 🐛 Detect encoding (incl BOM) when reading files
- 🐛 Move source into src/ to fix install
- ⚙ Migrate to pyproject.toml
- 🚧 CircleCI: Run black
- Merge pull request #83 from INL/feat/batch-processing
- Merge pull request #82 from CircleCI-config-suggestions-bot/StoreTestResults
- 🧹 .gitignore .python-version (for pyenv)
- 🧹 Remove qurator. namespace prefix
- 🐛 Fix installing by calling find_namespace_packages in setup.py
- 🕸 Do not use deprecated ID, pageId options
- 🔧 Remove explicit namespace_packages
- ✔ CircleCI: Explicitly install binary opencv-python-headless (dep of OCR-D?) to avoid compilation
- 🐛 Remove deprecated declare_namespace call
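The entry "Detect encoding (incl BOM) when reading files" in the list above refers to handling byte-order marks. The following is only a generic sketch of BOM-based decoding with the standard library, not dinglehopper's actual implementation; the function name `decode_with_bom` and the UTF-8 default are assumptions.

```python
import codecs

# Known byte-order marks and a codec that decodes the data and strips the mark.
# UTF-32 is checked before UTF-16 because BOM_UTF32_LE starts with BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_with_bom(data: bytes, default: str = "utf-8") -> str:
    """Decode bytes using the encoding signalled by a leading BOM, if any."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data.decode(encoding)
    # No BOM found: fall back to the assumed default encoding.
    return data.decode(default)
```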
nmalign cf7c60f..7832c90
Release: v0.0.3
- adapt to Numpy deprecations
ocrd_calamari 3a029ca..c0a4dfd
Release: v1.0.6
- Merge pull request #90 from OCR-D/tf_disable_interactive_logs
- ✒ README: Use backtick syntax for code block
- ✔ CircleCI: Do not test on Python 3.6 anymore (EOL since 2021-12-23)
- v1.0.6
- 🐛 Fix installation by keeping protobuf < 4.0
ocrd_cis 43a356a..fcc02fd
Release: v0.1.5
- adapt to Numpy and Pillow deprecations
- segment: fix baseline extraction
- segment: adapt to OpenCV changes
- resegment (baseline/ccomps): improve handling of fg conflicts
- resegment: add param baseline_only
- check_page/region/line: skip assumptions on number of components
- adapt to Shapely 2.0 deprecations
...
v2023-06-28
Changed:
- Bash prompt excludes user name undefined in docker container, #376, #366
- Docker: ocrd-all-tool.json is built during container build, #379
- Docker: XDG_CONFIG_HOME is set to XDG_DATA_HOME/ocrd-resources, so resources.yml is in usually-mounted location, #377, #252
- Docker: /data is world-writeable now, so log files can be written there, #377, #252
core 6708624..552cfcd
Release: v2.52.0
- Dockerfile: install wheel before make install
- 📦 v2.52.0
- 📝 changelog
- ci: debug macos
- Makefile: PIP_INSTALL from environment via ?=
- ci: try fixing macos
- Merge branch 'master' into improve-packaging
- Makefile: trailing whitespace
- update gitignore
- add ocrd.processor.builtin.dummy pkg (needed for resource discovery)
- add requirements.txt to manifest (so it's available at build time)
- make pypi: use build module instead of setuptools CLI
- make uninstall: run in reverse BUILD_ORDER
- make install: run conjunctively in BUILD_ORDER
ocrd_calamari 3a029ca..ed7a926
Release: v1.0.6
- v1.0.6
- 🐛 Fix installation by keeping protobuf < 4.0
ocrd_cis a0ea0a2..43a356a
Release: v0.1.5
- postcorrect: improve/update OCR-D wrapper…
- ocropy-train: improve/update OCR-D wrapper…
- ocrd-tool: rm old ocrd-cis-ocropy-rec (gone in 9e20991)
- Merge branch 'kba:typo' #91 into fix-alpha-shape
- Merge branch 'kba:double-page-max-size' #96 into fix-alpha-shape
- Merge branch 'kba:resolve-resources' #83 into fix-alpha-shape
ocrd_detectron2 04bf4c6..5f8bdcb
Release: v0.1.7
- update model URLs to GH release archive
- requirements.txt: avoid pkg names in comments
- requirements.txt: add pycocotools as explicit dependency
- require wheel (since pip does not pull it anymore)
- deps: no build isolation so Detectron2 compilation can use Torch
- Docker: add badges and basic description
- Docker username: vars, not env
- Docker: use underscore in tagname, alright
- add Dockerfile and GH Action to publish at Dockerhub and GHCR
ocrd_kraken b13dd8a..1e71324
Release: v0.3.0
- recognize: ignore 'one_channel_mode' unless model has 1 input channel
- segment/recognize: warn if no GPU available
ocrd_repair_inconsistencies c898d6c..cf879c1
- Merge pull request #13 from stweil/master
opencv-python 474a1cc..b534ea2
Release: 72
- Merge pull request #853 from asmorkalov/as/add_pyi_to_package
sbb_binarization 010ec99..f3c6ac8
Release: v0.1.0
- Update test.yml
tesseract 1569e50..bb8803a
Release: 5.3.1
- Update .mailmap
- Create config.yml
- Remove old broken GitHub action vcpkg-4.1.1 (fixes issue #4078)
- cmake: check if leptonica was build with tiff support
- cmake: provide info about disabled LibArchive and CURL
- cmake: allow to disable tiff (-DDISABLE_TIFF=ON)
- Merge pull request #4073 from stweil/osd
- Merge pull request #4071 from stweil/clean
- Merge pull request #4066 from stweil/lstmtraining
- Merge pull request #4068 from stweil/sprintf
- Remove unused code in function fix_rep_char
- Merge pull request #4067 from stweil/misc
- Support for Sgaw and W Pwo Karen languages in the Myanmar validator. (#4065)
- issue-bug.yml: Windows versions 7, 8, 8.1 are not supported anymore
- snap: Update from leptonica 1.74.2 to latest 1.83.1
- fix: Fix snap package building
- Create new release 5.3.1
- Remove whitespace at line endings
- Fix issue #4010 (#4041)
- cmake: add missing HAVE_NEON to config_auto.h
- Merge branch 'main' of https://github.com/tesseract-ocr/tesseract
- cmake: adjust build to autotool settings
- Merge branch 'main' of https://github.com/tesseract-ocr/tesseract
- cmake: improve NEON build
tesserocr e184c62..3c9519b
Release: v2.6.0
- add github-workflow building wheels
v2023-06-14
Changed:
- All docker images now contain git checkouts and retain /build, i.e. behave like the -git variants
- No more git updates within docker build, but fix git module dependency outside
- Reduce docker image size (by reinstating all-in-one layer, removing cache, avoiding duplicate CUDA libraries...)
- Use git submodule update --single-branch on CI to reduce docker image size
Added:
- make deps-cuda: non-intrusively support CUDA system dependencies (in docker or native)
- make ocrd-all-tool.json: Generate and upload a combination of all processors' ocrd-tool.json, #362
- make test-workflow: Run a workflow with most processors as a general smoke test
- make test-cuda: Test whether CUDA is properly set up and has a GPU available
- make test-core: Run OCR-D/core unit tests
Fixed:
- dependencies between modules, esp. with custom OCRD_MODULES selection
- editable mode (pip install -e)
- OpenCV build
- get tesserocr from PyPI if disabled as a module
- get ocrd from PyPI if core disabled as a module
- consistent interoperable module versions (esp. Numpy/OpenCV/Shapely/Protobuf/Torch/TF Python dependencies)
cor-asv-ann 006a70e..2c4b1ff
Release: v0.1.14
- CI: use ocrd/core-cuda as base image
- CI: dummy venv
- CI: use proper tab character
- CI: clone first
- CI: mkdir first
- CI: chdir to tmp location
- CI: use /tmp for aux clone of ocrd_all
- try getting tensorflow-gpu from Nvidia
- use proper URLs for submodules
- Merge pull request #6 from kba/init-report-dict
- evaluate: skip pages with no results
core de08453..6708624
Release: v2.51.0
- Merge pull request #1055 from bertsky/deps-cuda
- ci: disable upterm for gh actions
- readme: remove dockerhub/travis badge, add GH actions badge
- debug gh actions
- test bashlib: /usr/bin/env bash instead of /bin/bash
- test_workspace_bagger: use ocr-d.de instead of google.com for testing
- disable logging tests until properly fixed
- docker-image: reuse local ghcr.io image instead of docker.io
- 📦 v2.51.0
- 📝 changelog
- make help: improve description
- Revert "Merge remote-tracking branch 'hnesk/no-more-pkg_resources' into release-2.36.0"
- remove out-dated processor resources
- docker-cuda: improve (reduce size) again…
- docker-cuda: rewrite…
- core-cuda: use same CUDA libs as needed for Torch anyway
- Merge branch 'pr-1008' into reduce-cuda
- Merge branch 'master' of https://github.com/OCR-D/core into reduce-cuda
- make install on py36: revert to prefer-binary via install
- make install on py36: fix prefer-binary syntax
- make install on py36: prefer binary OpenCV/Numpy via pip config instead of preinstall
- core-cuda: install more CUDA libs via pip and ld.so.conf, simplify Dockerfile for that
- core-cuda: use CUDA 11.8, install cuDNN via pip and make available system-wide via ld.so.conf
- reinstate workaround for shapely, but more robust
- docker-cuda: change base image, no multi-CUDA runtimes
- keep gcc, no autoremove
- rehash after pip upgrade
- give up workaround for shapely-CUDA issue
dinglehopper 0fd4ea1..35be58c
- Merge pull request #83 from INL/feat/batch-processing
- Merge pull request #82 from CircleCI-config-suggestions-bot/StoreTestResults
- 🧹 .gitignore .python-version (for pyenv)
- 🧹 Remove qurator. namespace prefix
- 🐛 Fix installing by calling find_namespace_packages in setup.py
- 🕸 Do not use deprecated ID, pageId options
- 🔧 Remove explicit namespace_packages
- ✔ CircleCI: Explicitly install binary opencv-python-headless (dep of OCR-D?) to avoid compilation
- 🐛 Remove deprecated declare_namespace call
eynollah ea792d1..706433c
Release: v0.2.0
ocrd_cis c90b29f..a0ea0a2
Release: v0.1.5
- Merge branch 'kba:typo' #91 into fix-alpha-shape
- Merge branch 'kba:double-page-max-size' #96 into fix-alpha-shape
- Merge branch 'kba:resolve-resources' #83 into fix-alpha-shape
- segment: adapt to OpenCV changes
- resegment (baseline/ccomps): improve handling of fg conflicts
- resegment: add param baseline_only
- check_page/region/line: skip assumptions on number of components
- adapt to Shapely 2.0 deprecations
- adapt to Numpy 1.24 dtypes
- resegment: list instead of generator
- re/segment: improve polygon simplification
- re/segment: join_baselines: skip lines outside of polygon
- re/segment: join_baselines: for complex subtypes, apply recursively
- re/segment: join_polygons: connect touching neighbours, too
ocrd_fileformat dacfa50..4e7e0de
Release: v0.7.0
- 📦 v0.7.0
- update ocr-fileformat
ocrd_kraken 802c6b0..b13dd8a
Release: v0.3.0
- segment/recognize: default to device=cuda:0 (now backed by safe fall-back)
- segment/recognize: fall back to CPU if no CUDA device
- fix typo
- update changelog
- recognize: project text upwards in order by concatenation
- recognize: ensure baseline/boundary are consistent
- recognize: ignore invalid baselines
- setup metadata: update/improve
- deps-ubuntu: update
- improve/update readme
- Dockerfile: use CUDA base image, improve labels
- update changelog
- recognize: pass lines in baseline format if any baselines are annotated
- update blla.model URL (master→main)
- recognize: workaround for empty/failed line records
- recognize: workaround for better quality box cuts
- recognize: avoid invalid polygons on single-glyph words
- Revert "recognize: avoid invalid polygons on single-glyph words"
- segment: also show tags/type prediction
- recognize: avoid invalid polygons on single-glyph words
- recognize: use proper data structures of rpred
ocrd_pagetopdf 6155605..4f4a330
Release: v1.0.0
- Merge pull request #22 from bertsky/fix-input-files
ocrd_wrap 63c04d5..2cd800d
Release: v0.1.8
- 📦 0.1.8
- Merge pull request #10 from bertsky/update-numpy
opencv-python 6b73d90..474a1cc
Release: 72
- Merge pull request #849 from asmorkalov/as/python3_for_build
- Fix: numpy version for python 3.11 (#839)
- Merge pull request #852 from asmorkalov:as/ci_check
- Merge pull request #837 from bertsky/fix-py38-build
- Merge pull request #838 from henryiii/patch-2
sbb_binarization 39ef3fd..010ec99
Release: v0.1.0
workflow-configuration cb923f7..5aff777
- ocrd-import: add option --regex (positive path selector)
- ocrd-import: fix skipping in subshell
- add METS transforms to TOC
- generalise standalone CLI for both PAGE and METS XSL, update documentation
- mets-copy-agents.xsl: make path for other-mets relative to input m...
v2023-03-26
core cbe83ab..de08453
Release: v2.49.0
- 📦 v2.49.0
- 📝 changelog
- drop eynollah model from resource list, provided by eynollah itself
- 📝 changelog
- rename Docker image (to make work with GHCR)
- 📦 v2.48.1
- 📝 changelog
- core-cuda: CUDA 11.3 instead of 11.2
- ocrd_tool_validator: fix link in comment, fix #1019
- 📦 v2.48.0
- 📝 changelog
- Merge remote-tracking branch 'origin/build36-speedup-without-update'
- 📝 changelog
- Merge branch 'master' into fix-972
- Set ws outside the constructor
- chdir to ws in the beginning
- Undo the revert of getcwd()
- Revert getcwd() location - failing tests
- Raise the error instead of returning it
- Change getcwd() call location
- Chdir before processor.process() calls
- Fix the instance caching
- 📦 v2.47.4
- 📝 changelog
- Merge pull request #1011 from OCR-D/drop-fontgroup-resources
- 📦 v2.47.3
- 📝 changelog
- Dockerfile: reintroduce python3-pip so "pip install -U pip" succeeds again
- 📦 v2.47.2
- 📝 changelog
- Merge branch 'pr/986'
- 📦 v2.47.1
- 📝 changelog
- Merge pull request #1004 from stweil/fix-dockerfile
- Merge pull request #1003 from OCR-D/fix-docker-venv
- 📦 v2.47.0
- require importlib_resources for python <= 3.8, #996
- 📝 changelog
- resmgr: drop anybaseocr models from resource list
- 📝 changelog
- Merge branch 'pr/999'
- 📝 changelog
- ocrd_utils: adapt to newer importlib.resources API
- Merge pull request #994 from OCR-D/fix-scrutinizer
- 📝 changelog
- Merge pull request #977 from OCR-D/fix-add-agent
- 📝 changelog
- Merge branch 'pr/993'
- 📝 changelog
- Merge pull request #991 from bertsky/fix-resmgr-download-mimetype
- 📝 changelog
- Merge pull request #985 from OCR-D/fix-917
- 📦 v2.46.0
- 📝 changelog
- Merge branch 'drop-3.6'
- 📝 changelog
- Merge branch 'drop-mime-magic'
- 📝 changelog
- Merge branch 'pr/980'
- 📝 changelog
- Merge branch 'pr/981'
- 📝 changelog
- Merge pull request #972 from OCR-D/ref-processor-helper
- 📝 changelog
- Merge pull request #979 from OCR-D/workspace-validator-empty-pageid
- 📝 changelog
- Merge pull request #978 from OCR-D/bashlib-inputfiles
dinglehopper c4ab7c9..0fd4ea1 (rewind)
- ✔ Add @cneud's former 40 GB problem files to the test suite
- 🎨 Reformat using Black
- ✔ CircleCI: Test on Python 3.11
eynollah 13bc237..ea792d1
Release: v0.2.0
ocrd_anybaseocr 94e5037..5978a1f
Release: v1.9.0
ocrd_calamari c7ad6eb..3a029ca
Release: v1.0.5
- ✔ Do not test on Python 3.11 for now (unsupported)
- ✔ CircleCI: Install TF by explicitly invoking pip on Py 3.11
- ✔ Do not delete test workspace when DEBUG env variable is set
- ✔ CircleCI: Test on Python 3.11, too
- ✔ CircleCI: Install binary OpenCV for Python 3.6
- Revert "✔ CircleCI: Do not test on Python 3.6 anymore"
- ✔ CircleCI: Do not test on Python 3.6 anymore
- ✔ Fix tests to use the new filenames
- 🐛 Fix NumPy dependency (hopefully...)
- 🐛 Fix syntax error in setup.py
- 🐛 Require NumPy < 1.24 due to np.str deprecation/error
ocrd_detectron2 e005d3c..04bf4c6
Release: v0.1.7
- publish: commit images, too
- CI: publish test results to gh-pages
- add link for test results
- Delete jekyll-gh-pages.yml
- try to fix gh-pages
- add GH pages
- doc: add CI badge
- CI: use cache instead of artifacts for models
- CI: try to get the damn conditional to work
- CI: use stupid GHA negation syntax
- CI: fix action URL
- CI: fix negation
- CI: cache detectron models via artifacts
- Merge branch 'master' of ssh://github.com/bertsky/ocrd_detectron2
- some workarounds for broken model configs
- 📦 0.1.7
- add CLI test
- add models for magazine layout by Jambo-sudo (PubLayNet+custom GT) and LayoutParser (PRImA Layout GT)
- adapt to numpy v1.24
- 📦 0.1.6
- add fixture for badly written config files (base path)
- add models for table detection with Psarpei/Multi-Type-TD-TSR
- ocrd-tool resources: update/fix
- make deps: add torch deps explicitly
- avoid colon in generated region IDs
ocrd_fileformat 5022408..dacfa50
Release: v0.6.2
- 📦 v0.6.2
- 📝 changelog
- Merge pull request #44 from bertsky/patch-3
ocrd_keraslm 787341d..9c50478
Release: v0.4.1
- deps: hold numpy and h5py
ocrd_tesserocr 515be8d..09d1e13
Release: v0.17.0
- try with lowercase image tag
- rename Docker image
- Update docker-image.yml
- Create docker-image.yml
- 📦 0.17.0
- Merge pull request #191 from bertsky/override-insteadof-wrap
- deps-ubuntu: allow PPA to fail (on newer distributions)
- CI: chmod PPA Tesseract tessdata g+w
- deps-ubuntu: allow PPA to fail (on newer distributions)
- CI: fix conditional step syntax
- CI: seems to require sudo
- CI: speedup Py36 deps
- CI: update+simplify (on cimg/python:* instead of ocrd/core)
ocrd_typegroups_classifier ffa40fc..a78a85f
Release: v0.5.0
opencv-python 736b905..6b73d90
Release: 72
- Merge pull request #820 from asmorkalov/as/config_py_path
- Merge pull request #803 from asmorkalov/as/license
- Merge pull request #790 from asmorkalov/as/docs_update
- Merge pull request #787 from asmorkalov/as/migrate_mac_m1
- Merge pull request #744 from peter-kovacs-aimotive/add-vulkan-license
- Merge pull request #768 from AlexeySalmin/patch-1
- Merge pull request #776 from TheCleric/fix/numpy_version_mac
sbb_binarization aeb6804..39ef3fd
Release: v0.0.11
- Merge pull request #54 from qurator-spk/circleci-python37_38
tesseract a6e0aa7..1569e50
Release: 5.3.0
- textord: Catch empty rows in block iterator (fixes #4039)
- cmake: sync with autotools (OPENMP_SIMD...
v2023-02-06
core 6331433..ee92cfc
Release: 2.45.1
- 📦 v2.45.1
- 📝 changelog
- resmgr: insert from tool instead of append
ocrd_detectron2 fde2f3c..f3342a4
Release: 0.1.5
- 📦 0.1.5
- fix debug_img indentation (only once per page/table)
- ocrd-tool.json: fix PubLayNet/jpleorx model specs
ocrd_olahd_client 10d70a9..6bcbb4b
Release: v0.0.2
- 📦 v0.0.2
- 📝 changelog
- Merge branch 'manipulate-mets-agent'
opencv-python ede2269..736b905
Release: 68
- OpenCV package does not distribute zlib (#780)
- OpenCV 4.7.0 release preparation
- Merge pull request #756 from asmorkalov:as/pipelines_update_4.7
tesseract 6a21a74..4142b32
Release: 5.3.0
- Fix some whitespace issues in source code and text files
- Merge pull request #3992 from seupedro/patch-1
- fix "cannot pass non-trivial object of type 'std::string'"
- show output filename on successful creation of traineddata (combine_lang_model)
- fix "cannot pass non-trivial object of type 'std::string'"
- unicharset_extractor: run ReadMemBoxes only for box files; do not write unicharset in case of broken box file
- Update issue-bug.yml
- Create an issue template for a feature request
- Create a new issue template
- Create new release 5.3.0
- Update README.md
- cmake - msvc/openmp: clean&document configuration
- cmake - msvc: silence warning C4068: unknown pragma 'GCC'
- Create new release 5.3.0-rc1
- Replace MacOS -> macOS