
Installation of torch and torchvision not happening with Poetry #64520

Closed
kaustubhharapanahalli opened this issue Sep 5, 2021 · 19 comments
Labels
module: binaries - Anything related to official binaries that we release to users
module: dependency bug - Problem is not caused by us, but caused by an upstream library we use
oncall: releng - In support of CI and Release Engineering
triaged - This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments


kaustubhharapanahalli commented Sep 5, 2021

Hello,

I am trying to install torch and torchvision using Poetry. I am getting the following issue:

Updating dependencies
Resolving dependencies...

  SolverProblemError

  Because torchvision (0.10.0+cu111) depends on torch (1.9.0)
   and  depends on torch (1.9.0+cu111), torchvision is forbidden.
  So, because  depends on torchvision (0.10.0+cu111), version solving failed.

Poetry was installed using pip. I have added the following details about the Python version, Poetry version, and OS:

Python version: 3.9.6 (CPython)
Poetry version: 1.1.8
OS: Windows 10 (Version: 21H1; OS Build: 19043.1165)

Creation of virtual environments by Poetry is disabled in my configuration.

My pyproject.toml file looks like this:

[tool.poetry]
name = "ikshana"
version = "0.1.0"
description = ""
authors = ["Kaustubh Harapanahalli <kaustubhharapanahalli@gmail.com>"]

[tool.poetry.dependencies]
python = ">=3.9,<3.10"
torch = [
    { url = "https://download.pytorch.org/whl/cu111/torch-1.9.0%2Bcu111-cp39-cp39-win_amd64.whl", platform = "windows" },
    { url = "https://download.pytorch.org/whl/cu111/torch-1.9.0%2Bcu111-cp39-cp39-linux_x86_64.whl", platform "linux" }
]
torchvision = [
    { url = "https://download.pytorch.org/whl/cu111/torchvision-0.10.0%2Bcu111-cp39-cp39-win_amd64.whl", platform = "windows" },
    { url = "https://download.pytorch.org/whl/cu111/torchvision-0.10.0%2Bcu111-cp39-cp39-linux_x86_64.whl", platform = "linux" }
]
scipy = "^1.7.1"
matplotlib = "^3.4.3"
pandas = "^1.3.2"
torchsummary = "^1.5.1"
seaborn = "^0.11.2"
hiddenlayer = "^0.3"
tqdm = "^4.62.2"
imgaug = "^0.4.0"
albumentations = "^1.0.3"
plotly = "^5.2.2"

[tool.poetry.dev-dependencies]
pytest = "^5.2"
black = "^21.7b0"
pylint = "^2.10.2"
pydocstyle = "^6.1.1"
mypy = "^0.910"
pre-commit = "^2.14.0"
isort = "^5.9.3"
jupyter = "^1.0.0"
notebook = "^6.4.3"
jupyterlab = "^3.1.9"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Versions
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Home Single Language
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:26:21) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19043-SP0
Is CUDA available: N/A
CUDA runtime version: 11.4.48
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1050 Ti
Nvidia driver version: 471.96
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin\cudnn_ops_train64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] mypy==0.910
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.21.2
[pip3] torchsummary==1.5.1
[conda] Could not collect

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @seemethere @malfet @peterjc123 @mszhanyi @skyline75489 @nbcsm

@mrshenli added the module: binaries, module: build, and triaged labels Sep 7, 2021
@malfet added the module: windows label and removed the module: build label Sep 13, 2021

malfet commented Sep 13, 2021

Removed module: build (as there are no build problems) and added module: windows


seemethere commented Sep 13, 2021

Is this because poetry also considers the extra version information after the + symbol?

This seems to go against the standard Python versioning model of just ignoring what comes after the + symbol.

Edit: Yes according to PEP 440, poetry should be ignoring these local versions: https://www.python.org/dev/peps/pep-0440/#local-version-identifiers
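
For reference, the rule can be checked with the packaging library (the PEP 440 reference implementation that pip vendors). A minimal sketch of the spec behaviour, not of Poetry's solver:

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# The +cu111 wheel version and the bare pin that torchvision's metadata declares.
candidate = Version("1.9.0+cu111")
requirement = SpecifierSet("==1.9.0")

# PEP 440: a specifier without a local label must ignore the candidate's
# local label, so the cu111 build satisfies the plain ==1.9.0 pin.
print(candidate in requirement)       # True

# Strict version equality still distinguishes the two.
print(candidate == Version("1.9.0"))  # False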

@seemethere added this to Needs Triage in PyTorch Dev Infra Backlog via automation Sep 13, 2021
@seemethere removed the module: windows label Sep 13, 2021
@seemethere (Member)

Removing the windows label since this isn't necessarily related to windows

@driazati moved this from Needs Triage to Triaged (may need to revisit) in PyTorch Dev Infra Backlog Sep 13, 2021
@seemethere added the oncall: releng label Sep 13, 2021

kaustubhharapanahalli commented Sep 14, 2021

Is this because poetry also considers the extra version information after the + symbol?

I think so too, but I was not really sure how to confirm this.

@ezyang added the module: dependency bug label and removed the high priority label Sep 15, 2021

ezyang commented Sep 15, 2021

If it's really a poetry bug, we should file an upstream bug with them.

@kaustubhharapanahalli (Author)

I remember seeing a similar issue for pytorch 1.8 as well. I will link it here if I find it.

@kaustubhharapanahalli (Author)

A few issues that I came across: python-poetry/poetry#4231, python-poetry/poetry#2613

@colindean

python-poetry/poetry#4221 seems like it would address the local version…

@seemethere moved this from Triaged to Backlog in PyTorch Dev Infra Backlog Feb 28, 2022
@seemethere (Member)

We have PEP 503-compliant indices now, so this can be closed.

PyTorch Dev Infra Backlog automation moved this from Backlog to Done Feb 28, 2022

dlamusm commented May 5, 2022

So is this fixed? I tested the pytorch 1.11.0 CUDA 11.3 wheels and it is still not working for me.

@ThomasRobertFr

I agree that this issue is still present with all versions of pytorch.

I don't know whether pytorch or poetry needs to fix this, but I have tried pretty much everything and there is no way to install both torch and torchvision with a specific build (cpu, cuXXX). Because torchvision declares its torch dependency without a build tag (e.g. torchvision 0.10.0+cu111 depends on torch 1.9.0), poetry breaks with the message from the original post (i.e. it treats 1.9.0 != 1.9.0+cu111):

  Because torchvision (0.10.0+cu111) depends on torch (1.9.0)
   and  depends on torch (1.9.0+cu111), torchvision is forbidden.
  So, because  depends on torchvision (0.10.0+cu111), version solving failed.
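
To see where that bare torch (1.9.0) pin comes from, you can read the Requires-Dist metadata straight out of the torchvision wheel. A minimal stdlib-only sketch, assuming the cu111 wheel has already been downloaded locally (the filename below is just an example; adjust it to the wheel you actually have):

import zipfile

# Assumed local filename of the downloaded wheel.
WHEEL = "torchvision-0.10.0+cu111-cp39-cp39-win_amd64.whl"

with zipfile.ZipFile(WHEEL) as wheel:
    metadata_name = next(
        name for name in wheel.namelist()
        if name.endswith(".dist-info/METADATA")
    )
    metadata = wheel.read(metadata_name).decode("utf-8")

# Print only the torch requirement lines; they carry no +cu111 local tag,
# which is exactly what Poetry's solver then refuses to match.
for line in metadata.splitlines():
    if line.startswith("Requires-Dist: torch"):
        print(line)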


ThomasRobertFr commented Jun 17, 2022

@TCherici I have spent a full day (in May) trying all possible ideas I could think of to solve this issue, and found no real clean solution. The least shitty solution I chose is this:

[tool.poetry.dependencies]
torch = { version = "~1.10.2", optional = true }
torchvision = { version = "^0.11.3", optional = true }

[tool.poetry.extras]
torch = ["torch", "torchvision"]

# Relies on https://github.com/nat-n/poethepoet
[tool.poe.tasks]
install-pytorch = "pip install --force-reinstall --no-deps --no-cache-dir torch==1.10.2+cu111 torchvision==0.11.3+cu111 -f https://download.pytorch.org/whl/torch_stable.html"

To create a dev venv on mac we do:

poetry install -E torch

And in the Dockerfile used on the GPU server we do:

poetry install --no-dev
poe install-pytorch
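
A quick sanity check after either install path, to confirm that the CUDA-tagged build actually landed (assuming the ~1.10.2 pin above):

import torch

# The local tag tells you which build got installed, e.g. "1.10.2+cu111".
print(torch.__version__)

# Expected True on the GPU server, False in the CPU-only dev venv on Mac.
print(torch.cuda.is_available())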

@TCherici

@ThomasRobertFr Thank you very much for the thorough explanation!
I am honestly baffled that this is necessary at all, but I'm really glad I don't have to figure it out myself.

@ThomasRobertFr

@TCherici As mentioned in #64520 (comment), the bug is clearly on poetry's side, and I agree it's very annoying for people using pytorch: either you don't use poetry, or you have to use hacks to choose the right CUDA version...

I just created a new issue there: python-poetry/poetry#5863


TCherici commented Jun 17, 2022

I've played around with stuff, and managed to get torch and torchvision to run with poetry.

Installation of torch==1.11.0+cu113 and torchvision==0.12.0+cu113:

  • Upgrade poetry to version 1.2.0b2 or later (you might have to uninstall and reinstall poetry completely if your version is <=1.1 and you installed it with the get-poetry.py script).
  • Set the torch repo as a source: poetry config repositories.torch https://download.pytorch.org/whl/cu113
  • Install torch==1.11.0+cu113 directly from its wheel: poetry add https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp38-cp38-linux_x86_64.whl
  • Install torchvision with a plain poetry add torchvision; Poetry picked up the correct version (with cu113 support) from the torch repo.

I am not certain, but I think it is important to install torch from the wheel first and let torchvision be handled by poetry afterwards.

@ThomasRobertFr

Indeed the problem is solved in poetry 1.2. I'm waiting for a stable version though...

@TCherici

My pyproject.toml (cleaned up):

[tool.poetry]
name = "XXX"
version = "0.1.0"
description = "XXX"
authors = ["XXX"]

[tool.poetry.dependencies]
python = "^3.8"
torch = {url = "https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp38-cp38-linux_x86_64.whl"}
torchvision = "^0.12.0+cu113"

[[tool.poetry.source]]
name = "torch"
url = "https://download.pytorch.org/whl/cu113"
default = false
secondary = false

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

@Bonnevie

@ThomasRobertFr from the issue you submitted (python-poetry/poetry#5863), it seems you only saw the issue resolved with fixed wheel URLs?
The issue seems to persist on 1.2.0b2 when calling:

poetry add torch torchvision torchaudio --source torch

with an appropriately configured secondary source (I confirmed that I can install e.g. torch alone):

[[tool.poetry.source]]
name = "torch"
url = "https://download.pytorch.org/whl/cu113"
default = false
secondary = true

which still returns an error hinting at the + tags being used improperly:

> poetry add torch torchvision torchaudio --source torch
The currently activated Python version 3.7.4 is not supported by the project (^3.8).
Trying to find and use a compatible version. 
Using python3.8 (3.8.10)
Using version ^1.11.0+cu113 for torch
Using version ^0.12.0+cu113 for torchvision
Using version ^0.11.0+cu113 for torchaudio

Updating dependencies
Resolving dependencies... (8.0s)

Because no versions of torchaudio match >0.11.0+cu113,<0.12.0
 and torchaudio (0.11.0+cu113) depends on torch (1.11.0), torchaudio (>=0.11.0+cu113,<0.12.0) requires torch (1.11.0).
So, because project-management-experiment depends on both torch (^1.11.0+cu113) and torchaudio (^0.11.0+cu113), version solving failed.

Is it simply a question of this working on master, but not the prerelease? Or is it still open?


ThomasRobertFr commented Jun 22, 2022

@Bonnevie Hi,

The following file resolved and installed properly with poetry install (1.2.0b2):

[tool.poetry]
name = "test-pep-404"
version = "0.1.0"
description = ""
authors = [""]

[tool.poetry.dependencies]
python = "~3.7"

torch = { version = "^1.10.2", source="torch" }
torchvision = { version = "^0.11.3", source="torch" }
torchaudio = { version = "^0.10.0", source="torch" }

[[tool.poetry.source]]
name = "torch"
url = "https://download.pytorch.org/whl/cu113"
default = false
secondary = true

[build-system]
requires = ["poetry_core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Are you sure you're using poetry 1.2.0b2? On the server where I tried it, it installed in a different place than poetry 1.1 and was thus initially not in my PATH.

However, with your poetry add command, I get:

poetry add torch torchvision torchaudio --source torch
Using version ^1.11.0 for torch
Using version ^0.12.0 for torchvision
Using version ^0.11.0 for torchaudio

Updating dependencies
Resolving dependencies... (1.1s)

  RepositoryError

  403 Client Error: Forbidden for url: https://download.pytorch.org/whl/cu113/pillow/

This is a different error from yours, and I'm not getting the same versions: my poetry does not add +cu113 to them. That's why I suspect you don't have the right poetry version.

This 403 issue has already been reported to poetry: python-poetry/poetry#4885

diegoquintanav added a commit to EdAbati/fsdl-2022-weak-supervision-project that referenced this issue Oct 5, 2022
diegoquintanav added a commit to EdAbati/fsdl-2022-weak-supervision-project that referenced this issue Oct 8, 2022
EdAbati added a commit to EdAbati/fsdl-2022-weak-supervision-project that referenced this issue Oct 10, 2022
EdAbati added a commit to EdAbati/fsdl-2022-weak-supervision-project that referenced this issue Oct 14, 2022