MAINT: Use typing.IO for file streams #1498

Merged 4 commits on Dec 15, 2022

Commits on Dec 12, 2022

  1. DEV: Add in-project virtual envs to .gitignore

    Many developers (myself included) keep virtual environments inside the
    current project. These virtual environments are local development constructs
    and should not be checked into source control.
    
    This commit adds two common virtual environment directory names to the
    .gitignore to avoid accidental commits by future developers.
    thehale committed Dec 12, 2022
    3a1dcff
  2. DEV: Include pillow in requirements/dev.in

    The current contribution instructions in `docs/dev/intro.md` direct new code
    contributors to install the `dev` requirements. After following that
    instruction with the commands below, the minimal test suite fails with the
    errors shown in the summary that follows:
    
    ```
    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements/dev.txt
    pytest -m "not external" -m "not samples" -m "not slow"
    ```
    
    =================================================================================================== short test summary info ====================================================================================================
    FAILED tests/test_reader.py::test_get_images[pdflatex-outline.pdf-expected_images0] - ModuleNotFoundError: No module named 'PIL'
    FAILED tests/test_reader.py::test_get_images[crazyones.pdf-expected_images1] - ModuleNotFoundError: No module named 'PIL'
    FAILED tests/test_reader.py::test_get_images[git.pdf-expected_images2] - ModuleNotFoundError: No module named 'PIL'
    FAILED tests/test_reader.py::test_get_images[imagemagick-CCITTFaxDecode.pdf-expected_images5] - ModuleNotFoundError: No module named 'PIL'
    FAILED tests/test_reader.py::test_get_images[src6-expected_images6] - ModuleNotFoundError: No module named 'PIL'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/994/994636.pdf-tika-994636.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/952/952133.pdf-tika-952133.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/914/914568.pdf-tika-914568.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/952/952016.pdf-tika-952016.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/965/965118.pdf-tika-952016.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/959/959184.pdf-tika-959184.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/958/958496.pdf-tika-958496.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/972/972174.pdf-tika-972174.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/972/972243.pdf-tika-972243.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/969/969502.pdf-tika-969502.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction[https://arxiv.org/pdf/2201.00214.pdf-arxiv-2201.00214.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction_strict - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    FAILED tests/test_workflows.py::test_image_extraction2[https://corpora.tika.apache.org/base/docs/govdocs1/977/977609.pdf-tika-977609.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
    ======================================================================= 18 failed, 536 passed, 5 skipped, 53 deselected, 5 xfailed in 146.94s (0:02:26) ========================================================================
    
    This commit adds `pillow` to `requirements/dev.in` so that the minimal test
    suite passes on the first try and new code contributors can start
    implementing improvements with confidence.
    thehale committed Dec 12, 2022
    6b7e055
  3. STY: Use official IO type for file streams

    The Python standard library provides the `IO` type for file streams. (Source:
    https://docs.python.org/3/library/typing.html#typing.IO)
    
    This commit replaces the complex Union type of the `IO` implementations with the
    official `IO` type. This will improve the accuracy of type checking in users'
    IDEs.
    thehale committed Dec 12, 2022
    f25e817
  4. STY: Use standard IO type hint for writers

    The CI system flagged some additional conflicts with the `IO` type in the writer
    classes.
    
    This commit changes the writer classes to use the standard `IO` type instead of
    the union of IO implementations.
    thehale committed Dec 12, 2022
    c9e7ec3
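
To illustrate the idea behind commits 3 and 4, here is a minimal sketch of swapping a union of concrete stream classes for `typing.IO`. The alias names, the members of the union, the `IO[bytes]` parameterisation, and the `write_header` helper are all assumptions made for illustration; they are not taken from the project's code.

```python
import tempfile
from io import BufferedReader, BufferedWriter, BytesIO, FileIO
from typing import IO, Union

# Hypothetical "before" alias: every accepted stream class is listed by hand,
# so any other file-like object is rejected by a type checker.
UnionStreamType = Union[BytesIO, BufferedReader, BufferedWriter, FileIO]

# Hypothetical "after" alias: the generic stream type from the standard library.
StreamType = IO[bytes]


def write_header(stream: StreamType) -> int:
    """Write a PDF-style header line and return the number of bytes written."""
    return stream.write(b"%PDF-1.3\n")


# An in-memory buffer and a temporary file on disk are both accepted.
buffer = BytesIO()
written = write_header(buffer)

with tempfile.TemporaryFile() as fh:
    written_to_file = write_header(fh)
```

With an annotation like this, callers can pass whatever binary stream they already have, and the signature shown in an IDE is a single standard type rather than a list of implementations.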