MAINT: Use typing.IO for file streams #1498
Merged
Commits on Dec 12, 2022
DEV: Add in-project virtual envs to .gitignore
Many developers (like myself) like to use virtual environments located inside the current project. These virtual environments are local development constructs and should not be checked into source control. This commit adds two common virtual environment directory names to the .gitignore to prevent future developers from committing them by accident.
Commit 3a1dcff
DEV: Include `pillow` in `requirements/dev.in`
The current contribution instructions in `docs/dev/intro.md` direct new code contributors to install the `dev` requirements. After following that instruction, the minimal test suite fails with the following errors:

```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements/dev.txt
pytest -m "not external" -m "not samples" -m "not slow"
```

==================== short test summary info ====================
FAILED tests/test_reader.py::test_get_images[pdflatex-outline.pdf-expected_images0] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[crazyones.pdf-expected_images1] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[git.pdf-expected_images2] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[imagemagick-CCITTFaxDecode.pdf-expected_images5] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[src6-expected_images6] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/994/994636.pdf-tika-994636.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/952/952133.pdf-tika-952133.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/914/914568.pdf-tika-914568.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/952/952016.pdf-tika-952016.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/965/965118.pdf-tika-952016.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/959/959184.pdf-tika-959184.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/958/958496.pdf-tika-958496.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/972/972174.pdf-tika-972174.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/972/972243.pdf-tika-972243.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/969/969502.pdf-tika-969502.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://arxiv.org/pdf/2201.00214.pdf-arxiv-2201.00214.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction_strict - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction2[https://corpora.tika.apache.org/base/docs/govdocs1/977/977609.pdf-tika-977609.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
==================== 18 failed, 536 passed, 5 skipped, 53 deselected, 5 xfailed in 146.94s (0:02:26) ====================

This commit adds `pillow` to `requirements/dev.in` so that the minimal test suite passes on the first try and new code contributors can start implementing improvements with confidence.
Commit 6b7e055
STY: Use official `IO` type for file streams
The Python standard library provides the `IO` type for file streams. (Source: https://docs.python.org/3/library/typing.html#typing.IO) This commit replaces the complex Union type of the `IO` implementations with the official `IO` type. This will improve the accuracy of type checking in users' IDEs.
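As a rough illustration of why `IO` is a better fit than a union of concrete classes, here is a minimal sketch. The helper `peek_header` is hypothetical and is not part of this project. It is annotated with `typing.IO[bytes]`, so both an in-memory `BytesIO` and whatever object `tempfile.TemporaryFile()` returns on the current platform satisfy the annotation, without either class being listed explicitly.

```python
import tempfile
from io import BytesIO
from typing import IO


def peek_header(stream: IO[bytes], size: int = 5) -> bytes:
    """Return the first `size` bytes of a binary stream, restoring its position.

    Hypothetical helper for illustration only; not taken from the project.
    """
    position = stream.tell()  # remember where the caller left the stream
    stream.seek(0)
    header = stream.read(size)
    stream.seek(position)  # put the stream back where it was
    return header


# An in-memory stream satisfies IO[bytes].
in_memory = BytesIO(b"%PDF-1.7\n")
in_memory.seek(3)
memory_header = peek_header(in_memory)

# So does a temporary file, whose concrete class varies by platform.
with tempfile.TemporaryFile() as handle:
    handle.write(b"%PDF-1.4\n")
    file_header = peek_header(handle)

print(memory_header, file_header)
```

With a union of concrete implementations, every new stream class would have to be added to the union before a type checker accepted it. Annotating against `IO` avoids that maintenance.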
Commit f25e817
STY: Use standard `IO` type hint for writers
The CI system flagged some additional conflicts with the `IO` type in the writer classes. This commit changes the writer classes to use the standard `IO` type instead of the union of IO implementations.
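The same idea applies on the writing side. The sketch below uses a hypothetical `write_marker` function, which is an assumption made for illustration and not the project's writer API. It shows a write target annotated with `IO[bytes]` accepting both an in-memory buffer and a real file handle.

```python
import tempfile
from io import BytesIO
from typing import IO


def write_marker(stream: IO[bytes], marker: bytes = b"%%EOF\n") -> int:
    """Write `marker` to any binary stream and return the number of bytes written.

    Hypothetical function for illustration only; not taken from the project.
    """
    return stream.write(marker)


# Writing to an in-memory buffer.
buffer = BytesIO()
written = write_marker(buffer)

# Writing to a temporary file through the same annotation.
with tempfile.TemporaryFile() as handle:
    write_marker(handle)
    handle.seek(0)
    on_disk = handle.read()

print(written, buffer.getvalue(), on_disk)
```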
Commit c9e7ec3