Reading Python source files with importlib.resources #237

Open
layday opened this issue Apr 10, 2020 · 16 comments
Labels
enhancement New feature or request

Comments

@layday

layday commented Apr 10, 2020

Unlike in CPython, it is not possible to read Python source files using importlib.resources from within PyOxidizer.

@indygreg
Owner

This would be a very useful feature!

I'm thinking that we'll want to expose some control over this though. Here's what I'm thinking:

  • Expose a global boolean flag at packaging time to determine if sources are exposed as resources by default.
  • Add some kind of Starlark control for per-module/per-package toggling.
  • Maybe expose an API to Python itself to control this at run-time (I've been wanting to expose some of the importer's state to Python code to facilitate inspection, etc).

@layday
Author

layday commented Sep 3, 2021

It's hard to tell if any of the new resource options have made this possible - could you confirm that Python source files are still invisible to importlib.resources?

@ofek
Sponsor

ofek commented May 27, 2022

Is there a reason why we don't support this? https://docs.python.org/3/library/importlib.html#importlib.resources.files

@indygreg
Owner

The Traversable interface has been on my radar for a few years. See https://gitlab.com/python-devs/importlib_resources/-/issues/90.

Since the API did ship, it should probably be implemented. Although I'm still not thrilled about the proliferation of resources APIs in the standard library and the fact that the Traversable interface is tightly coupled to behavior of filesystems. But that ship has sailed.

@ofek
Sponsor

ofek commented Jun 1, 2022

In trying to make virtualenv compatible with PyOxidizer, it seems Python file support is required. Example: virtualenv.create.via_global_ref._virtualenv.py

I get: FileNotFoundError: resource not found

@indygreg
Owner

indygreg commented Jun 2, 2022

Note that PyOxidizer has support for the Python 3.7 era importlib.resources APIs. The OxidizedFinder meta path importer does implement a get_resource_reader() that returns an importlib.abc.ResourceReader-conforming object (https://docs.python.org/3.11/library/importlib.html#importlib.abc.ResourceReader).
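
For illustration, here is a minimal sketch of probing a package's loader for the 3.7-era reader. It uses the stdlib `json` package as a stand-in; the helper name `get_reader` is hypothetical, and behavior under OxidizedFinder is assumed to be analogous rather than verified here.

```python
import importlib.util


def get_reader(package_name):
    """Return the package loader's ResourceReader, or None if it has none."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.loader is None:
        return None
    # get_resource_reader() is an optional loader hook added in Python 3.7.
    hook = getattr(spec.loader, "get_resource_reader", None)
    return hook(package_name) if hook is not None else None


reader = get_reader("json")
# is_resource() and open_resource() are part of the ResourceReader ABC.
if reader is not None and reader.is_resource("__init__.py"):
    with reader.open_resource("__init__.py") as fh:
        first_bytes = fh.read(64)
```

A loader that lacks the hook, or returns None from it, simply yields None here, which is the case calling code needs to handle.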

What isn't yet implemented is the Traversable interface introduced in Python 3.9 (https://docs.python.org/3.11/library/importlib.html#importlib.abc.TraversableResources).

To be honest, I'm surprised people are running into problems here. Given that the Traversable interface and corresponding APIs like importlib.resources.files() were introduced in 3.9, I'd expect very few applications or libraries to be targeting 3.9+ exclusively. Most resource access code in the wild does feature detection and is able to fall back to older resources APIs if the newer ones aren't available.

It is possible some code somewhere (including in the Python standard library) is buggy and assumes that the presence of a 3.9 API (e.g. importlib.resources.files()) implies support for these APIs on individual meta path importers. I could easily see how someone could test for hasattr(importlib.resources, "files") and assume all meta path importers support .files(). This behavior would be naive about the possibility of 3rd party meta path importers, such as OxidizedFinder.
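
To make the distinction concrete, the sketch below contrasts the module-level check with a per-loader check. Both function names are hypothetical, and the per-loader test is only a rough heuristic (it is not how the standard library itself decides which code path to take).

```python
import importlib.resources
import importlib.util


def module_has_files_api():
    """Naive check: only says the stdlib function exists (Python 3.9+)."""
    return hasattr(importlib.resources, "files")


def loader_supports_files(package_name):
    """Stricter check: does this package's own reader expose files()?"""
    spec = importlib.util.find_spec(package_name)
    loader = getattr(spec, "loader", None)
    hook = getattr(loader, "get_resource_reader", None)
    if hook is None:
        return False
    reader = hook(package_name)
    return reader is not None and hasattr(reader, "files")
```

The first function can return True while the second returns False for the same package, which is exactly the situation a third-party importer without Traversable support would create.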

@indygreg
Owner

indygreg commented Jun 2, 2022

Looking through the code of CPython 3.11, it looks like ~everything in importlib.resources now goes through .files(). Fortunately, there is an importlib.resources._adapters module that purports to transparently expose the .files() + Traversable interface to meta path importers not supporting the 3.9 API. So this should all just work, even without us explicitly supporting the 3.9+ APIs.

@indygreg
Owner

indygreg commented Jun 2, 2022

In trying to make virtualenv compatible with PyOxidizer, it seems Python file support is required. Example: virtualenv.create.via_global_ref._virtualenv.py

That is seemingly related to https://github.com/pypa/virtualenv/blob/a8fd40f6f6b1b20af1293c926a191d61ad4f5098/src/virtualenv/create/via_global_ref/_virtualenv.py#L7 and therefore #69. Virtualenv should wean itself off __file__, as the module attribute is defined as optional. https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer_behavior_and_compliance.html#no-file and https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer_resource_files.html document this extensively.

While I'm here, what does os.path.join(__file__) even do?! Why would you call os.path.join() with a single argument? I've never seen this pattern! Supposedly it does some kind of normalization. But reading the stdlib docs doesn't make it apparent to me! Could somebody please enlighten me?

@ofek
Sponsor

ofek commented Jun 2, 2022

The failing code is importlib.resources.read_text('virtualenv.create.via_global_ref', '_virtualenv.py')

@indygreg
Owner

indygreg commented Jun 2, 2022

The failing code is importlib.resources.read_text('virtualenv.create.via_global_ref', '_virtualenv.py')

In that case, it might be due to the fact that _virtualenv.py is a Python module and not a non-Python file resource. PyOxidizer treats these as separate resource types and may only expose non-module files to the resources APIs. But Python - where everything is just files under the hood - doesn't apply any distinction.
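
For comparison, a small sketch of the stock CPython behavior: a `.py` file inside a package can be read like any other resource. This assumes Python 3.9+ and a filesystem-backed standard library; `json` and `decoder.py` are just convenient stand-ins for any package and module file.

```python
import importlib.resources

# On stock CPython the package directory is traversed like a filesystem,
# so a module's source file is visible as an ordinary resource.
package_root = importlib.resources.files("json")
source = package_root.joinpath("decoder.py").read_text(encoding="utf-8")
```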

The solution for this is for PyOxidizer to expose Python modules to resources APIs. I thought I wrote patches for this already. But they may be languishing in my unmerged commit series to overhaul how resources work, as it is clear PyOxidizer's model isn't compatible with Python's view of the world. It should be possible to ship a quick fix that makes Python modules available via the resources APIs.

A workaround until that lands is to use the get_source() API for accessing the source code of Python modules. https://docs.python.org/3.11/library/importlib.resources.abc.html#InspectLoader.get_source. I'm unsure if there is a helper method in importlib for it. But you could do e.g. import virtualenv.create.via_global_ref as m; m.__loader__.get_source("_virtualenv"). That - or a similar variation - should work with oxidized_importer.
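
A hedged sketch of that workaround follows. It looks up the module's own loader through importlib.util.find_spec() and passes the full dotted module name, since InspectLoader.get_source() is documented as taking a module name. The helper name `read_module_source` is hypothetical, the stdlib module `json.decoder` stands in for the virtualenv module, and whether oxidized_importer returns source here is an assumption that depends on the source having been packaged.

```python
import importlib.util


def read_module_source(fullname):
    """Return a module's source via its loader, or None if unavailable."""
    spec = importlib.util.find_spec(fullname)
    if spec is None or spec.loader is None:
        return None
    # get_source() is optional; not every loader implements InspectLoader.
    get_source = getattr(spec.loader, "get_source", None)
    return get_source(fullname) if get_source is not None else None


text = read_module_source("json.decoder")
```

Callers should still handle a None result, because a loader may legitimately have no source available for a module.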

@ofek
Sponsor

ofek commented Jun 2, 2022

It should be possible to ship a quick fix that makes Python modules available via the resources APIs.

I'd prefer to wait for that 🙂

@ofek
Sponsor

ofek commented Jun 2, 2022

This should nearly work afterward: pypa/virtualenv@main...ofek:resources

@ofek
Sponsor

ofek commented Jun 6, 2022

It should be possible to ship a quick fix that makes Python modules available via the resources APIs.

Anything I could do to help? Both CLIs I want to oxidize (Hatch for PyPA, another for work) depend on virtualenv.

@tsibley
Contributor

tsibley commented Sep 12, 2022

I could easily see how someone could test for hasattr(importlib.resources, "files") and assume all meta path importers support .files(). This behavior would be naive about the possibility of 3rd party meta path importers, such as OxidizedFinder.

Just today I ran into this exact issue with the latest certifi release.

As a workaround, I'm planning to downgrade the Python version in my PyOxidizer-built executable from 3.10 to 3.8 to evade certifi's version detection.

Looking through the code of CPython 3.11, it looks like ~everything in importlib.resources now goes through .files(). Fortunately, there is an importlib.resources._adapters module that purports to transparently expose the .files() + Traversable interface to meta path importers not supporting the 3.9 API. So this should all just work, even without us explicitly supporting the 3.9+ APIs.

Unfortunately that adapter only exists on 3.11. On 3.9 and 3.10, trying to use the Traversable / files() API with an importer that doesn't support it will fail with either a TypeError or ValueError.
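
One defensive pattern a library could use is sketched below: prefer files(), but fall back to the 3.7-era function when the call fails with the exceptions reported above. The helper name `read_resource_text` is hypothetical, and the assumption that the legacy path succeeds under a given third-party importer is untested here. Note that on 3.11+ the legacy function is itself routed through files(), so the fallback mainly helps on 3.9 and 3.10.

```python
import importlib.resources


def read_resource_text(package, name, encoding="utf-8"):
    """Read a text resource, preferring files() but tolerating old importers."""
    files = getattr(importlib.resources, "files", None)
    if files is not None:
        try:
            return files(package).joinpath(name).read_text(encoding=encoding)
        except (TypeError, ValueError):
            # Failure modes reported for importers without Traversable
            # support; fall through to the legacy API.
            pass
    # Legacy Python 3.7-era function.
    return importlib.resources.read_text(package, name, encoding=encoding)
```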

importlib_resources >=5.3.0 could fill that gap… but that's unlikely since it has to be used by everything doing importing.

@tsibley
Contributor

tsibley commented Sep 12, 2022

(To be clear, the issues I mention aren't specific to the ostensible topic of this issue, which is reading bundled Python source files with importlib.resources.)

@pquentin

pquentin commented Nov 29, 2022

The adapter is in a different place for Python 3.10: https://github.com/python/cpython/blob/3.10/Lib/importlib/_adapters.py. And using the version of importlib_resources used at the time of Python 3.10.8 fixes jsonschema too, so something else is going on.
