
It seems pytorch broke something again, which broke poetry #6965

Closed
4 tasks done
MaKaNu opened this issue Nov 4, 2022 · 6 comments
Labels
status/duplicate Duplicate issues

Comments

@MaKaNu
Contributor

MaKaNu commented Nov 4, 2022

  • Poetry version: 1.2.2
  • Python version: 3.10.8
  • OS version and name: Windows 10
  • pyproject.toml: a fresh Poetry setup for a new project; the steps to reproduce included below lead to the same pyproject.toml
  • I am on the latest stable Poetry version, installed using a recommended method.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have consulted the FAQ and blog for any relevant entries or release notes.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option) and have included the output below.

Issue

We noticed yesterday (03.11.22) that we could not install pytorch on a new system because one dependency, nvidia-cudnn-cu11, could not be satisfied. So I reproduced the setup with the following steps and ended up with the same issue:

$ poetry new test_torch_version_1.13.0
$ cd .\test_torch_version_1.13.0\
$ poetry add torch=1.13.0

which always results in the following error:

Creating virtualenv test-torch-version-1-13-0 in D:\Mitarbeiter\Kaupenjohann\15_python_ws\test_torch_version_1.13.0\.venv

Updating dependencies
Resolving dependencies...

Package operations: 6 installs, 1 update, 0 removals

  • Updating setuptools (65.3.0 -> 65.5.0)
  • Installing nvidia-cublas-cu11 (11.10.3.66)
  • Installing nvidia-cuda-nvrtc-cu11 (11.7.99)
  • Installing nvidia-cuda-runtime-cu11 (11.7.99)
  • Installing nvidia-cudnn-cu11 (8.5.0.96)
  • Installing typing-extensions (4.4.0)

  RuntimeError

  Unable to find installation candidates for nvidia-cudnn-cu11 (8.5.0.96)

  at ~\AppData\Roaming\pypoetry\venv\lib\site-packages\poetry\installation\chooser.py:103 in choose_for
       99│
      100│             links.append(link)
      101│
      102│         if not links:
    → 103│             raise RuntimeError(f"Unable to find installation candidates for {package}")
      104│
      105│         # Get the best link
      106│         chosen = max(links, key=lambda link: self._sort_key(package, link))
      107│
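For context, the artifacts PyPI actually serves for the failing pin can be listed directly. A minimal sketch of that check, assuming only the public PyPI JSON API and the Python standard library (package name and version taken from the error above):

import json
from urllib.request import urlopen

# Package and version taken from the RuntimeError above.
PACKAGE = "nvidia-cudnn-cu11"
VERSION = "8.5.0.96"

# The public PyPI JSON API lists every file uploaded for a release.
url = f"https://pypi.org/pypi/{PACKAGE}/{VERSION}/json"
with urlopen(url) as response:
    release = json.load(response)

# Print each artifact's filename; for this release only manylinux wheels
# appear, so there is nothing Poetry can choose on Windows.
for artifact in release["urls"]:
    print(artifact["filename"])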

So we tinkered further, analyzed the poetry.lock, and found the dependencies for torch=1.13.0:

[package.dependencies]
nvidia-cublas-cu11 = "11.10.3.66"
nvidia-cuda-nvrtc-cu11 = "11.7.99"
nvidia-cuda-runtime-cu11 = "11.7.99"
nvidia-cudnn-cu11 = "8.5.0.96"
typing-extensions = "*"

Et voilà, we found our troublemaker package. So we checked metadata.files and saw:

[metadata.files]
nvidia-cudnn-cu11 = [
    {file = "nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl", hash = "sha256:402f40adfc6f418f9dae9ab402e773cfed9beae52333f6d86ae3107a1b9527e7"},
    {file = "nvidia_cudnn_cu11-8.5.0.96-py3-none-manylinux1_x86_64.whl", hash = "sha256:71f8111eb830879ff2836db3cccf03bbd735df9b0d17cd93761732ac50a8a108"},
]
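
As a quick cross-check, the two filenames above can be tested against the wheel tags the local interpreter accepts. A small sketch, assuming the third-party packaging library is installed; on Windows neither wheel matches, which is exactly why the chooser raised:

from packaging.tags import sys_tags
from packaging.utils import parse_wheel_filename

# The two filenames recorded in poetry.lock for nvidia-cudnn-cu11 8.5.0.96.
WHEELS = [
    "nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl",
    "nvidia_cudnn_cu11-8.5.0.96-py3-none-manylinux1_x86_64.whl",
]

# Every tag the running interpreter/platform can install.
supported = set(sys_tags())

for filename in WHEELS:
    _name, _version, _build, tags = parse_wheel_filename(filename)
    # On Windows this prints False for both: only manylinux tags are offered.
    print(filename, "->", any(tag in supported for tag in tags))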

Only manylinux wheels, then, and no win_amd64 like the other packages. So we were curious: what are the dependencies for version 1.12.1?

poetry add torch=1.12.1

and surprise:

Updating dependencies
Resolving dependencies...

Writing lock file

Package operations: 1 install, 0 updates, 5 removals

  • Removing nvidia-cublas-cu11 (11.10.3.66)
  • Removing nvidia-cuda-nvrtc-cu11 (11.7.99)
  • Removing nvidia-cuda-runtime-cu11 (11.7.99)
  • Removing setuptools (65.5.0)
  • Removing wheel (0.37.1)
  • Installing torch (1.12.1)

The CUDA dependencies are gone. We are well aware that pytorch with CUDA is a mess in Poetry overall, because:

[image: linus-nvidia]

But this problem also appears whenever pytorch-lightning or any other tool depends on pytorch, and resolving such conflicts was absolutely catastrophic, since the newer version got auto-resolved every time...

@MaKaNu added the kind/bug (Something isn't working as expected) and status/triage (This issue needs to be triaged) labels Nov 4, 2022
@dimbleby
Contributor

dimbleby commented Nov 4, 2022

Seems like a duplicate of #6939 and, as there, not a poetry bug. Take it to pytorch!

@MaKaNu
Contributor Author

MaKaNu commented Nov 4, 2022

It is not a direct duplicate of #6939, since I traced the problem down to torch directly. I also see a different issue here: installing torch 1.13.0 directly with pip works without any problem. That means the Meta developers screwed up the metadata, and it might happen again. In the meantime it is ridiculous for a developer to have to figure out that the package they wanted to install relies on a dependency whose new release has broken metadata, because it in turn depends on a package that is not available for their platform.

Adding a new package like this might be fine, but if it happens during an update it takes a huge amount of investigation to recognize the issue.

@neersighted
Member

Poetry can't do anything about bad metadata from a package; you should pin to an old version of torch until this is resolved upstream.

@neersighted closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 4, 2022
@neersighted added the status/duplicate (Duplicate issues) label and removed the kind/bug (Something isn't working as expected) and status/triage (This issue needs to be triaged) labels Nov 4, 2022
@MaKaNu
Contributor Author

MaKaNu commented Nov 4, 2022

I know that Poetry can't do anything about bad metadata from a package, but maybe we can turn this into a feature: when bad metadata from a package shows up, the error message could also include the complete dependency tree. That would make it much faster to analyze the issue. We were only able to figure this out after a huge time investment sweeping through poetry.lock.
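
Until something like that exists, the sweep through poetry.lock can at least be scripted. A rough sketch of a reverse-dependency lookup, assuming Python 3.11+ for the standard-library tomllib (on older interpreters the third-party tomli package is a drop-in substitute):

import tomllib  # Python 3.11+; on older versions substitute the tomli package

# The package the resolver could not find installation candidates for.
BROKEN = "nvidia-cudnn-cu11"

with open("poetry.lock", "rb") as fh:
    lock = tomllib.load(fh)

# poetry.lock stores one [[package]] table per locked package; the optional
# [package.dependencies] table lists what that package in turn requires.
for package in lock.get("package", []):
    dependencies = package.get("dependencies", {})
    if BROKEN in dependencies:
        print(f"{package['name']} {package['version']} requires "
              f"{BROKEN} {dependencies[BROKEN]}")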

@neersighted
Member

Detecting "bad" (in this case, mismatched) metadata is its own challenge; we will not have a robust and performant way to do so until PEP 658 is implemented by indexes. At that point, yes, we can fail with an accurate, detailed, and descriptive error message when this happens.
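
For illustration: PEP 658 lets an index publish a wheel's core METADATA next to the wheel itself, at the wheel's URL with .metadata appended, so a resolver can inspect Requires-Dist without downloading (or even being able to install) the wheel. A rough sketch of that lookup, using a hypothetical wheel URL and assuming the index serves the sidecar file:

from email.parser import HeaderParser
from urllib.request import urlopen

# Hypothetical wheel URL; per PEP 658 the core metadata lives at the same
# URL with ".metadata" appended, so no wheel download is required.
wheel_url = "https://example.org/packages/torch-1.13.0-cp310-cp310-win_amd64.whl"

with urlopen(wheel_url + ".metadata") as response:
    metadata = HeaderParser().parsestr(response.read().decode("utf-8"))

# Requires-Dist headers are the dependency declarations a resolver would check.
for requirement in metadata.get_all("Requires-Dist", []):
    print(requirement)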


github-actions bot commented Mar 1, 2024

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2024