importlib.metadata.Distribution.files empty for editable install #96144

Closed
mattefowler opened this issue Aug 20, 2022 · 9 comments

mattefowler commented Aug 20, 2022

Bug report

when installing a package as editable, importlib.metadata.Distribution.files returns only the metadata for the install, and not the actual files of the distribution.

notably, this does not occur if the package location is separately added to $PYTHONPATH or sys.path.

  1. install any package with -e
  2. obtain the importlib.metadata.Distribution for that package.
  3. enumerate the files property.

e.g:
given a test project set up as:

src
└── pkg
    ├── setup.py
    └── pkg
        └── pymodule.py

that has been installed as editable (e.g. by running pip install -e pkg from src), attempt to discover the modules of the package like so:

from importlib.metadata import Distribution
from pprint import pprint

pprint({d.name: d.files for d in Distribution.discover()})

Output

{'editable-install-package-test': [PackagePath('setup.py'),
                                   PackagePath('editable_install_package_test.egg-info/PKG-INFO'),
                                   PackagePath('editable_install_package_test.egg-info/SOURCES.txt'),
                                   PackagePath('editable_install_package_test.egg-info/dependency_links.txt'),
                                   PackagePath('editable_install_package_test.egg-info/top_level.txt')],
#...
}

note no modules from pkg are listed.

Your environment

  • CPython versions tested on: 3.10.4, 3.10.5
  • Operating system and architecture: macOS 11.6, Ubuntu 20.04.4
@mattefowler mattefowler added the type-bug An unexpected behavior, bug, or error label Aug 20, 2022
@tirkarthi
Member

I believe this should be reported in https://github.com/python/importlib_metadata and below seem to be related to this issue. cc: @jaraco

pypa/packaging-problems#609
python/importlib_metadata#402

@jaraco jaraco self-assigned this Sep 5, 2022
@jaraco
Member

jaraco commented Sep 5, 2022

I'm not able to replicate this issue on Python 3.11rc1 (and pylauncher):

 draft $ git clone -q gh://pypa/sampleproject
 draft $ cd sampleproject
 sampleproject main $ py -m venv .venv
 sampleproject main $ py -m pip uninstall -y -q setuptools
 sampleproject main $ py -m pip install -e .
Obtaining file:///Users/jaraco/draft/sampleproject
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting peppercorn
  Using cached peppercorn-0.6-py3-none-any.whl (4.8 kB)
Building wheels for collected packages: sampleproject
  Building editable for sampleproject (pyproject.toml) ... done
  Created wheel for sampleproject: filename=sampleproject-2.0.0-0.editable-py3-none-any.whl size=3620 sha256=4696e44c9b167004826edc9f45a0a4fd5e73e6b7f41b304f36d11e041bfff1ad
  Stored in directory: /private/var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-ephem-wheel-cache-ayzft_i8/wheels/b3/64/f7/8fd2c4f9ec0e108f3635016ba712ae6ed605348d989214ae48
Successfully built sampleproject
Installing collected packages: peppercorn, sampleproject
Successfully installed peppercorn-0.6 sampleproject-2.0.0
 sampleproject main $ py -c "import importlib.metadata as metadata, pprint; pprint.pprint({d.name: d.files for d in metadata.distributions() if d.name == 'sampleproject'})"
{'sampleproject': [PackagePath('LICENSE.txt'),
                   PackagePath('README.md'),
                   PackagePath('pyproject.toml'),
                   PackagePath('setup.cfg'),
                   PackagePath('setup.py'),
                   PackagePath('data/data_file'),
                   PackagePath('src/sample/__init__.py'),
                   PackagePath('src/sample/package_data.dat'),
                   PackagePath('src/sample/simple.py'),
                   PackagePath('src/sampleproject.egg-info/PKG-INFO'),
                   PackagePath('src/sampleproject.egg-info/SOURCES.txt'),
                   PackagePath('src/sampleproject.egg-info/dependency_links.txt'),
                   PackagePath('src/sampleproject.egg-info/entry_points.txt'),
                   PackagePath('src/sampleproject.egg-info/requires.txt'),
                   PackagePath('src/sampleproject.egg-info/top_level.txt')]}

The output is neither empty nor is it missing files from the distribution.

Since the report says that the output should be empty for any project installed as editable, I believe the report is invalid as described.

Although I believe the output you described above, it doesn't describe a situation where I can replicate the issue (I don't have the contents of setup.py).

If this issue is still reproducible for you, could you put together a repository or zip of the project where the issue occurs?

@jaraco jaraco added pending The issue will be closed if no feedback is provided and removed type-bug An unexpected behavior, bug, or error labels Sep 5, 2022
@mattefowler
Author

mattefowler commented Sep 6, 2022

the details of the repro case are described in the issue; it has been attached as a zip for your convenience at your request. it requires creating a setup.py file and at least 1 module inside the package to be installed.

one possibly relevant detail that is only implicitly stated is the use of namespace packages - your repro includes an __init__.py where this does not.

also noted in the issue, I am not using python 3.11, but python 3.10.5

the title is not entirely accurate, but I don't think that makes the issue 'invalid'. hopefully you will be able to piece together what you require. if additional supplemental data is required, please let me know.

pkg.zip

@jaraco
Member

jaraco commented Sep 9, 2022

Oh, this is interesting. Using your repro, I can in fact replicate the issue locally.

After pip installing the package as editable using the latest setuptools, I see that files returns different things depending on whether the package was queried by distributions() or distribution(<name>):

 pkg $ py -c "import importlib.metadata as md, pprint; dist = md.distribution('editable-install-package-test'); print(dist.name); pprint.pprint(dist.files)"
editable-install-package-test
[PackagePath('setup.py'),
 PackagePath('editable_install_package_test.egg-info/PKG-INFO'),
 PackagePath('editable_install_package_test.egg-info/SOURCES.txt'),
 PackagePath('editable_install_package_test.egg-info/dependency_links.txt'),
 PackagePath('editable_install_package_test.egg-info/top_level.txt'),
 PackagePath('pkg/pymodule.py'),
 PackagePath('pkg/test.py')]
 pkg $ py -c "import importlib.metadata as metadata, pprint; pprint.pprint({d.name: d.files for d in metadata.distributions()})"
{'editable-install-package-test': [PackagePath('__editable__.editable_install_package_test-1.0.0.pth'),
                                   PackagePath('__editable___editable_install_package_test_1_0_0_finder.py'),
                                   PackagePath('__pycache__/__editable___editable_install_package_test_1_0_0_finder.cpython-311.pyc'),
                                   PackagePath('editable_install_package_test-1.0.0.dist-info/INSTALLER'),
                                   PackagePath('editable_install_package_test-1.0.0.dist-info/METADATA'),
                                   PackagePath('editable_install_package_test-1.0.0.dist-info/RECORD'),
                                   PackagePath('editable_install_package_test-1.0.0.dist-info/REQUESTED'),
                                   PackagePath('editable_install_package_test-1.0.0.dist-info/WHEEL'),
                                   PackagePath('editable_install_package_test-1.0.0.dist-info/direct_url.json'),
                                   PackagePath('editable_install_package_test-1.0.0.dist-info/top_level.txt')],
 'pip': [PackagePath('../../../bin/pip'),
...
         PackagePath('pip/py.typed')]}

I can see that in one case the metadata is coming from the .egg-info, and in the other from a .dist-info. The latter appears to belong to a virtual package generated for editable installs.

I'll need to do some more investigation.

@jaraco jaraco removed the pending The issue will be closed if no feedback is provided label Sep 9, 2022
@jaraco
Member

jaraco commented Sep 10, 2022

I created this dockerfile to easily replicate the issue:

FROM jaraco/multipy-tox
RUN apt install -y libarchive-tools
RUN pipx install httpie
RUN http --follow GET https://github.com/python/cpython/files/9497942/pkg.zip | bsdtar x
WORKDIR pkg
RUN py -m venv .venv
RUN py -m pip uninstall -y setuptools
RUN py -m pip install -e .
CMD py -c "import importlib.metadata as metadata, pprint; pprint.pprint({d.name: d.files for d in metadata.distributions() if d.name.startswith('editable')})"

@jaraco
Member

jaraco commented Sep 10, 2022

Aha! The use of the dict comprehension is what is masking the presence of two metadata implementations for editable-install-package-test:

 draft $ docker run -it @$(docker build -q .) py
Python 3.11.0rc1 (main, Aug  8 2022, 18:31:02) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib.metadata as md
>>> list(md.distributions(name='editable-install-package-test'))
[<importlib.metadata.PathDistribution object at 0xffff85f55690>, <importlib.metadata.PathDistribution object at 0xffff85f56210>]
>>> first, last = _
>>> first._path
PosixPath('editable_install_package_test.egg-info')
>>> last._path
PosixPath('/pkg/.venv/lib/python3.11/site-packages/editable_install_package_test-1.0.0.dist-info')
>>> print(last._path.joinpath('RECORD').read_text())
__editable__.editable_install_package_test-1.0.0.pth,sha256=RfNM_PnQ7Q7oymwDD9UoSNhXmK2P1ma_ESogUdx3cnU,129
__editable___editable_install_package_test_1_0_0_finder.py,sha256=nnZGfSGFb7yyQL-TiVtZLRe-oRpVlvEA-ykzD5-VZRE,2426
__pycache__/__editable___editable_install_package_test_1_0_0_finder.cpython-311.pyc,,
editable_install_package_test-1.0.0.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
editable_install_package_test-1.0.0.dist-info/METADATA,sha256=wKEf02jLQlTmMc4ho2M-Mqget_eTeP3Bf_nlj9UJhuc,74
editable_install_package_test-1.0.0.dist-info/RECORD,,
editable_install_package_test-1.0.0.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
editable_install_package_test-1.0.0.dist-info/WHEEL,sha256=G16H4A3IeoQmnOrYV4ueZGKSjhipXx8zc8nu9FGlvMA,92
editable_install_package_test-1.0.0.dist-info/direct_url.json,sha256=I_eYQVMngRISOFe9bpo0tUdlBlup4ztzU98zXspoXMA,54
editable_install_package_test-1.0.0.dist-info/top_level.txt,sha256=8jjfKuFvlaNGG7JiuNtS31gIuwOm8thUcUQoNbsxxls,4

That explains why there are different values coming from different calls. distribution('editable-install-package-test') returns the metadata for the first match, whereas the dict comprehension for distributions() will resolve the last one found for that name.
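The masking effect can be reproduced without installing anything. The following is a minimal sketch that uses stand-in objects (SimpleNamespace) in place of real Distribution instances, with abbreviated file lists; group_files_by_name is a hypothetical helper for illustration, not part of importlib.metadata.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_files_by_name(dists):
    """Collect the file lists of every distribution that shares a name."""
    grouped = defaultdict(list)
    for dist in dists:
        # dist.files can be None when no file listing is available.
        grouped[dist.name].append(list(dist.files or []))
    return dict(grouped)


# Stand-ins for the two metadata directories found above (file lists abbreviated).
egg_info = SimpleNamespace(
    name="editable-install-package-test",
    files=["setup.py", "pkg/pymodule.py"],
)
dist_info = SimpleNamespace(
    name="editable-install-package-test",
    files=["__editable__.editable_install_package_test-1.0.0.pth"],
)
found = [egg_info, dist_info]

# A dict comprehension keeps only the last entry seen for each key.
collapsed = {d.name: d.files for d in found}

# Grouping by name keeps both entries visible.
grouped = group_files_by_name(found)
```

Against a real environment, the same helper could presumably be called as group_files_by_name(importlib.metadata.distributions()) to spot names backed by more than one metadata directory.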

Moreover, that also explains why the files are missing from the second metadata found. The metadata that pip/setuptools installs for that editable package doesn't include the list of files "installed" (the RECORD file only includes the metadata and finder hook).
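To check what a RECORD file actually lists, its path column can be read with the csv module. This is a rough sketch: record_paths is a hypothetical helper, and the sample text is an abbreviated copy of the RECORD shown above.

```python
import csv
import io


def record_paths(record_text):
    """Return the path column of a RECORD file (CSV rows: path, hash, size)."""
    return [row[0] for row in csv.reader(io.StringIO(record_text)) if row]


# Abbreviated copy of the RECORD printed above.
sample = """\
__editable__.editable_install_package_test-1.0.0.pth,sha256=RfNM_PnQ7Q7oymwDD9UoSNhXmK2P1ma_ESogUdx3cnU,129
editable_install_package_test-1.0.0.dist-info/METADATA,sha256=wKEf02jLQlTmMc4ho2M-Mqget_eTeP3Bf_nlj9UJhuc,74
editable_install_package_test-1.0.0.dist-info/RECORD,,
"""

paths = record_paths(sample)

# True when nothing under the project's own package directory is recorded.
no_package_modules = not any(p.startswith("pkg/") for p in paths)
```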

In that sense, importlib metadata is doing the "right" thing here (reflecting the reality of the situation, if somewhat awkwardly), but the PyPA is going to need to figure out what needs to happen to generate the correct metadata for an editable-installed package.

There are a few questions I'd like to answer:

  1. What is generating the .egg-info metadata and why is it necessary (why if an isolated editable install is being generated does egg-info get generated at all)?
  2. Why does the dist-info not contain modules/packages from the package? Is that by design? I guess so, because pip would expect to be able to delete those files on an uninstall, so if setuptools were to include those files in the RECORD, that could end up deleting a user's source code (possibly the sole copy).
  3. Does this issue affect flit? That is, if a flit package is editable-installed, where is its metadata found and does it include a list of files?

It's quite possible because of (2) that the metadata specification is going to need to be expanded to support "files linked to but not installed by this package" or possibly separate the editable shim metadata from the metadata for the package under development.

cc: @abravalheri, @pfmoore; If you have some time to review the findings above, I'd be interested in your insights and links to any known issues relating to these concerns or other considerations you may have.

@abravalheri

Hi @jaraco,

What is generating the .egg-info metadata and why is it necessary (why if an isolated editable install is being generated does egg-info get generated at all)?

This is a tricky subject. Generally, I have the impression that the setuptools implementation of build_meta "bleeds" the .egg-info directory: there are many calls to egg-info (to find the requirements, to create metadata, to generate the manifest, ...) and the egg_base is not explicitly set to a temporary directory most of the time.

The sdist and bdist_wheel commands themselves will run the egg_info command again if they don't find the .egg-info folder. The impact of this is even worse: egg_base is set when running the prepare_metadata_... hooks (pre 660, post 660), but then this setting is not preserved when the other PEP 517 hooks run (each hook runs in a fresh subprocess).

When implementing PEP 660, I tried to improve this behaviour. We can see that when running pip -v install -e . --no-build-isolation --use-pep517, no .egg-info folder is created.

However, get_requires_for_build_editable is delegated to get_requires_for_build_wheel, and this is what creates the .egg-info in the CWD.

Why does the dist-info not contain modules/packages from the package? Is that by design?

My understanding from the specs is that RECORD should contain files inside the wheel archive, which is not the case for the modules and packages in an editable install.

Does this issue affect flit? That is, if a flit package is editable-installed, where is its metadata found and does it include a list of files?

If we take the example of pypa/installer (which uses flit) and perform an editable install, the module files will not be listed in RECORD (similar behaviour to setuptools). The files property should return something similar to the following:

>>> from pprint import pprint
>>> import importlib.metadata as md
>>> installer = next(md.distributions(name='installer'))
>>> pprint(installer.files)
[PackagePath('installer-0.6.0.dev0.dist-info/INSTALLER'),
 PackagePath('installer-0.6.0.dev0.dist-info/LICENSE'),
 PackagePath('installer-0.6.0.dev0.dist-info/METADATA'),
 PackagePath('installer-0.6.0.dev0.dist-info/RECORD'),
 PackagePath('installer-0.6.0.dev0.dist-info/REQUESTED'),
 PackagePath('installer-0.6.0.dev0.dist-info/WHEEL'),
 PackagePath('installer-0.6.0.dev0.dist-info/direct_url.json'),
 PackagePath('installer.pth')]
>>> installer._path
PosixPath('/tmp/venv/lib/python3.10/site-packages/installer-0.6.0.dev0.dist-info')

@abravalheri

I believe that (right now) it is tricky for importlib.metadata and anyone using it to assume that there will only be one .dist-info/.egg-info directory in sys.path for a given distribution name.

Even if we re-implement get_requires_for_build_editable in a way that does not create an .egg-info folder, the PEP 517 hooks for building regular wheels and sdists will create an .egg-info directory in CWD.

@jaraco
Member

jaraco commented Oct 16, 2022

Thanks @abravalheri. That's very useful. It clarifies that for editable installs, even for packages installed by flit (discarding any setuptools-specific issues), the RECORD file does not list the package modules.

That leads me to believe this issue can't be solved by importlib metadata, but first needs to be solved by the PyPA - to design and then implement a spec for reporting the "files" of an editable install... or to declare that such behavior is unsupported.

In the meantime, importlib metadata will continue to make a best effort to reflect the metadata that's present. I don't believe there's anything particularly wrong with the current implementation, other than it awkwardly and arbitrarily exposes different metadata depending on the technique used to consume it (dict comprehension vs list iteration). I plan to file an issue with pypa/packaging-problems to track the underlying need.
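Until such a spec exists, a consumer can at least tell when the files list belongs to an editable shim. The sketch below assumes the installer wrote a direct_url.json following PEP 610 (that file appears in both dist-info listings above); is_editable is a hypothetical helper, and the stand-in objects and the file:///pkg URL are made up for illustration.

```python
import json
from types import SimpleNamespace


def is_editable(dist):
    """Return True if the distribution's direct_url.json marks it as editable."""
    # Distribution.read_text returns None when the file is absent.
    text = dist.read_text("direct_url.json")
    if not text:
        return False
    return bool(json.loads(text).get("dir_info", {}).get("editable", False))


# Stand-ins for real Distribution objects (contents are illustrative only).
editable = SimpleNamespace(
    read_text=lambda name: '{"url": "file:///pkg", "dir_info": {"editable": true}}'
)
plain = SimpleNamespace(read_text=lambda name: None)

editable_flag = is_editable(editable)
plain_flag = is_editable(plain)
```

This only detects the situation; it does not recover the module files, which remain unlisted in RECORD as discussed above.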
