This issue was moved to a discussion.

Backend-dependent PyTorch versions #6671

Closed
2 tasks done
BlueskyFR opened this issue Oct 1, 2022 · 19 comments
Labels
status/duplicate Duplicate issues

Comments

@BlueskyFR

BlueskyFR commented Oct 1, 2022

  • Poetry version: 1.2.0

  • Python version: 3.10.6

  • OS version and name: Arch Linux

  • I have searched the issues of this repo and believe that this is not a duplicate.

  • I have consulted the FAQ and blog for any relevant entries or release notes.

Issue

Hi!

I do deep learning and am currently trying to switch to Poetry for better dependency management 😄
I immediately ran into a problem when trying to install PyTorch:

  1. PyTorch versions are backend-dependent, so the latest PyTorch version has releases for, say, 20 different CUDA versions, and constraining everyone who clones the project to the same CUDA version makes no sense; I guess Poetry cannot currently handle this.
  2. PyTorch is not on PyPI; more on that below.

In order to solve the first problem (even if official support from Poetry would be appreciated), I went with the famous light-the-torch, which automatically installs the right PyTorch version depending on the detected backend.
The problem with this is that torch is not added to pyproject.toml afterwards, so if I do a subsequent poetry install XXX, the torch package is very likely to be replaced by the torch from pip.

Also, specifying the PyTorch URL in the .toml is not a solution either, since it depends on the local backend. In a PyTorch project, the common factor is the PyTorch package version, not the backend on which it runs; the latter just ensures the project can run on a variety of configurations.

This means that Poetry is not currently compatible with PyTorch. I don't think that treating PyTorch as a special case is a good idea, since this release design just exposes the need to compile a package for each particular software stack; it is really a general problem which must be solved IMO.

I'll be happy to discuss below how Poetry could adapt to this design!

@BlueskyFR BlueskyFR added kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels Oct 1, 2022
@BlueskyFR BlueskyFR changed the title Hardware-dependent PyTorch versions Backend-dependent PyTorch versions Oct 1, 2022
@neersighted neersighted added status/duplicate Duplicate issues and removed kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels Oct 1, 2022
@BlueskyFR
Author

A solution would be to have a post-install hook that runs light-the-torch, but the torch installed with it would have to be frozen afterwards to ensure it is not replaced by a subsequent poetry install.

@neersighted
Member

neersighted commented Oct 1, 2022

Duplicate of #2145 -- this is in general pretty far out of scope for Poetry as things currently exist, but possible with a plugin. Painful, but possible, and we can gradually introduce hooks to make this sort of thing easier.

Keep in mind that this is, in fact, highly specific to PyTorch -- there is no standard convention for distributing wheels built against different ML APIs. PyTorch does it one way, and other ML packages do it in other ways. A standard for describing and reasoning about wheel compatibility is needed for support in Poetry beyond a package-specific plugin, as the +cu111 et al. convention is just that -- an ad hoc (mis)use of local versions that existing tools interact with in ways that are sometimes unexpected (to those not familiar with Python packaging).
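The local-version behavior described above can be observed with the packaging library; a small sketch (the version numbers are illustrative, not tied to any real release):

```python
# Sketch of how a "+cu116"-style local version segment behaves under PEP 440,
# using the `packaging` library. Version numbers are illustrative.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

v = Version("1.12.1+cu116")
print(v.public)  # "1.12.1" -- the public version
print(v.local)   # "cu116"  -- the ad hoc local segment used for the backend

# A bare "==1.12.1" pin still matches the local variant, one of the
# "unexpected ways" tools can interact with this convention:
print(v in SpecifierSet("==1.12.1"))  # True
```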

In order for broad support in the ecosystem (including natively in Poetry) to happen, standardization of ML APIs/ABIs is necessary as part of the wheel spec (or a successor).

@neersighted neersighted closed this as not planned Won't fix, can't repro, duplicate, stale Oct 1, 2022
@BlueskyFR
Author

BlueskyFR commented Oct 1, 2022

I totally agree that there is no standard for this, but the problem is still present, so I hope some help can be provided on Poetry's side through tools such as hooks and package freezing.

@neersighted
Member

If you're willing to freeze versions there's no problem -- add the correct pytorch.org package index and you'll be locked to one API version.

Per #6409, performance leaves something to be desired, as we currently emulate the behavior of pip (with its new resolver) and check every index exhaustively. However, you can do what you want today.
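As a sketch of what "adding the correct pytorch.org package index" could look like in pyproject.toml (the source name, index URL, and cu116/1.12.1 versions are made-up examples; secondary = true was the Poetry 1.2-era way to declare a supplemental source):

```toml
# Hypothetical example: lock the project to one CUDA variant of torch.
[[tool.poetry.source]]
name = "pytorch-cu116"
url = "https://download.pytorch.org/whl/cu116"
secondary = true

[tool.poetry.dependencies]
python = "^3.10"
torch = { version = "1.12.1+cu116", source = "pytorch-cu116" }
```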

If you are talking about install-time selection of the proper variant, that is a full duplicate of the issue I linked. The best we'll be able to do until such time that we standardize markers in the ecosystem is adding some sort of hooks for custom markers -- but there's a lot of work necessary on the Plugin API before we can even think about such hooks.

@BlueskyFR
Author

Downloading 50 GB of packages is not an option for me sadly, so I guess this leaves me with no solution.

@neersighted
Member

It's very unclear what you want, I think. Are you asking for some way to add packages to a Poetry environment using poetry install that bypasses the normal resolution process? Until PyTorch indexes implement PEP 658 (and we gain support), we will have to download wheels for every platform as a result of how Python packaging works at a fundamental level.

#4956 or a successor will help you if you want to limit the scope of compatibility (e.g. you never plan to install on Windows so you don't care about solving for Windows).

Basically, if you don't want to solve ahead of time and have a universal poetry install that works everywhere, Poetry may not be the right tool for you.

If you're willing to accept solving ahead of time requiring downloading PyTorch wheels, poetry export + ltt may just work for you -- many projects (including poetry-core and certbot) make use of Poetry for management of a requirements.txt list.
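The poetry export + ltt combination mentioned above might look roughly like this (a sketch, assuming light-the-torch's ltt command is installed; flags shown are the common ones for this workflow):

```shell
# Lock and export everything except torch with Poetry...
poetry export -f requirements.txt --without-hashes -o requirements.txt
pip install -r requirements.txt

# ...then let light-the-torch pick the torch build that matches
# the locally detected backend (CUDA version, CPU-only, etc.).
ltt install torch
```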

@Secrus
Member

Secrus commented Oct 1, 2022

@BlueskyFR I would try addressing this issue with the PyTorch team, since they are the ones doing non-standard things. I don't like the idea of Poetry, which is based on widely accepted standards, having to adapt to non-standard ways. The way I see it, they could ship a simple wheel on PyPI that provides a CLI for setting up a proper environment.

@BlueskyFR
Author

BlueskyFR commented Oct 2, 2022

@neersighted sorry for being unclear. What I want is the following:

  1. I run poetry install on the cloned repo, which doesn't have torch in its dependencies
  2. I then run poetry run ltt install torch which installs the latest torch, compatible with my local backend
  3. I then "freeze" PyTorch so that Poetry cannot replace it with the pip version if I then run poetry add X

Is it possible?

@BlueskyFR
Author

In the same spirit, what if I want to install a custom built PyTorch version?

@neersighted
Member

neersighted commented Oct 2, 2022

You're really asking for a feature where you can inject 'fake' packages into Poetry's resolution, so that Poetry considers them satisfied and solved for.

I'd create a new feature request issue for that -- the basic idea is that you would specify something like:

[tool.poetry]
dependencies-external = ["pytorch"]

And Poetry would consider pytorch: * to be provided and act like it was locked/installed, while not in fact locking/installing it at all, and just trusting you, the end user, to install it correctly (e.g. using ltt) so your code can run.

Please note the above design is ad-hoc -- what the final design would look like, and if this would be accepted by the project at all would have to be hammered out on the FR issue you create, and/or on the PR defining the implementation.

@neersighted
Member

neersighted commented Oct 2, 2022

In the same spirit, what if I want to install a custom built PyTorch version?

You can do this today with URL dependencies and markers (but, as markers do not include any facility to discriminate based on ML API, this doesn't solve anything you couldn't do already with the pytorch indexes).
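A URL dependency with a marker could be declared like this in pyproject.toml (the wheel URL and the marker are purely hypothetical placeholders):

```toml
[tool.poetry.dependencies]
# Hypothetical custom-built wheel; URL and marker are placeholders.
torch = { url = "https://example.com/wheels/torch-1.12.1+custom-cp310-cp310-linux_x86_64.whl", markers = "sys_platform == 'linux'" }
```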

@BlueskyFR
Author

Thanks for the feedback.
I don't think I have the time to write a FR and follow it at the moment, as this is likely to take weeks, and I am looking for a quick solution.
So to wrap up, ltt is not compatible with Poetry in its current state, right?

@neersighted
Member

Poetry is not designed to interoperate with other tools that manipulate packages in its dependency tree, no. Even if we add the feature I described above, it will always be a best-effort/"it happens to work" sort of thing. That is to say, you're taking a lot into your own hands, and if Poetry's incomplete solution ignoring a package breaks when combined with LTT's, that's on you to solve, since it's not reasonably a problem with either Poetry or LTT.

@BlueskyFR
Author

That's right, but I am just disappointed by the fact that there is no way to manage dependencies in a PyTorch project 😢

@neersighted
Member

neersighted commented Oct 2, 2022

Sorry to hear that -- Poetry works fine for users who are able to ensure a consistent ML API situation across all their install targets. For Poetry to 'just work' across APIs and not require compromises like the proposed feature above, this is a topic for the PyPA, discuss.python.org, and a PEP defining ML APIs as a new wheel tag.

@BlueskyFR
Author

Is the issue with the secondary download URL being addressed? That would be the beginning of a solution.

@neersighted
Member

neersighted commented Oct 2, 2022

That's purely cosmetic -- it's a consequence of how additional sources are designed, and if you run pip in verbose mode with --extra-index-url you will see it does the same thing (a source with neither secondary nor default set duplicates pip's --extra-index-url behavior, while default duplicates --index-url; secondary is a Poetry invention and is still unconditionally searched).

#5984 (comment) is a proposal to solve this by breaking the 'purely pip-like' semantics of non-PyPI sources.

@BlueskyFR
Author

Ok.
Why does Poetry download the wheel file twice when specifying { url = "XXX" }?
It is downloaded once during resolution and a second time to "upgrade" it, downloading the exact same file again.

@neersighted
Member

If the wheel wasn't installed by Poetry, it may be missing a PEP 610 marker, aka direct_url.json. That is to say, Poetry will only consider it the same torch version and not reinstall it if the marker exists and matches the URL that Poetry was configured with.
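For reference, the PEP 610 marker lives at <package>.dist-info/direct_url.json inside site-packages; a minimal, illustrative example for a URL-installed wheel (the URL here is made up) might look like:

```json
{
  "url": "https://download.pytorch.org/whl/cu116/torch-1.12.1%2Bcu116-cp310-cp310-linux_x86_64.whl",
  "archive_info": {}
}
```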

This is getting fairly off topic and turning into more of a support discussion (and I think the original issue was more of a question than anything actionable anyway) -- I'm migrating this to Discussions as such.

@python-poetry python-poetry locked and limited conversation to collaborators Oct 2, 2022
@neersighted neersighted converted this issue into discussion #6680 Oct 2, 2022
