Cannot install PyTorch 1.13.x with PDM #1732
Comments
If you use CUDA:
@xiaojinwhu If you use CUDA 11.7, you don't actually need to add an extra index, as you can see above. That's the problem: it should work without that extra index. This works with
I am working on a Mac M1, where torch 1.13.1 installs successfully without CUDA, so I am afraid I cannot reproduce it. You can try to investigate it yourself, or perhaps someone else can help. For example, try to find out why it misses the .so files when other installers (such as
I'm having a similar (probably even the same) problem and I suspect the I discovered the following issue with the nvidia libraries (nvidia_cublas_cu11, nvidia_cuda_nvrtc_cu11, etc.): With
As soon as you activate
The content of
I hope that this issue can be fixed somehow (I don't know how standard-compliant it is for several packages to install into a common package folder), because the nvidia packages are the primary reason I activated
@michaelze Thanks for the investigation, but the wheel So there might be some other packages that install Ah, yes, you list the packages below. The problem is, when they share the namespace Try setting
I tested your suggestion but the problem still persists. Looking at the PyTorch source code (https://github.com/pytorch/pytorch/blob/v1.13.1/torch/__init__.py#L144) reveals the underlying problem:
So the problem here is, I think,
From looking at the code, PyTorch 2.0.0 might actually work with PDM and
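To check whether the nvidia libraries are spread over several folders in a given environment, a small diagnostic can help. The following is only a sketch, not PyTorch's code. It assumes, going by the linked function, that the loader expects cublas and cudnn under one common nvidia folder on sys.path. The function names and the default library list are hypothetical.

```python
import os
import sys


def find_nvidia_dirs(search_paths, libs=("cublas", "cudnn")):
    """Map each search path that has an `nvidia` folder to the libs found under it."""
    report = {}
    for path in search_paths:
        nvidia_path = os.path.join(path, "nvidia")
        if not os.path.isdir(nvidia_path):
            continue
        # Assumption: each library lives in nvidia/<lib>/lib.
        report[path] = [
            lib for lib in libs
            if os.path.isdir(os.path.join(nvidia_path, lib, "lib"))
        ]
    return report


def has_common_parent(report, libs=("cublas", "cudnn")):
    """True if at least one `nvidia` folder contains every requested lib."""
    return any(set(libs) <= set(found) for found in report.values())


if __name__ == "__main__":
    report = find_nvidia_dirs(sys.path)
    for path, found in report.items():
        print(path, "->", found or "no matching libs")
    print("single nvidia folder with all libs:", has_common_parent(report))
```

If the last line prints False while the individual libraries do show up under different paths, the libraries are installed but not side by side, which would match the behavior described in this thread.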
That special treatment does exist, but only for PEP 420 namespace packages (package without
@michaelze It seems to work for me with
If you install
See this function https://github.com/pytorch/pytorch/blob/v1.13.1/torch/__init__.py#L163.
If you have a local CUDA toolkit 11.7 installation (may also work for 11.8, as long as
If you don't have a local toolkit installation, now
It seems the installation of
pdm add https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp310-cp310-linux_x86_64.whl
The disadvantage is
It doesn't work with the latest pdm or pytorch. I'm forced to run a script like this, which copies the libraries directly into the cache:
cp -r /home/user/.cache/pdm/packages/nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64/lib/nvidia/nccl /home/user/.cache/pdm/packages/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64/lib/nvidia/
cp -r /home/user/.cache/pdm/packages/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64/lib/nvidia/nvtx /home/user/.cache/pdm/packages/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64/lib/nvidia/
cp -r /home/user/.cache/pdm/packages/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64/lib/nvidia/cufft /home/user/.cache/pdm/packages/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64/lib/nvidia/
...
It works when done explicitly, but users should not have to do this. For example, would a workaround be possible that downloads only the libraries from nvidia (explicitly named libraries, like in pdm.toml) directly instead of symlinking them (cache_method)? By the way, in my environment,
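The manual cp commands can be generalized. Below is a rough sketch of the same workaround in Python, assuming each cached package keeps its libraries under lib/nvidia/<name> as in the paths above. The function name merge_nvidia_libs is hypothetical, and it modifies the cache in place just like the cp commands do.

```python
import shutil
from pathlib import Path


def merge_nvidia_libs(source_pkgs, target_pkg):
    """Copy every lib/nvidia/<name> folder from source_pkgs into target_pkg."""
    target_root = Path(target_pkg)
    target = target_root / "lib" / "nvidia"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for pkg in source_pkgs:
        pkg = Path(pkg)
        if pkg.resolve() == target_root.resolve():
            continue  # never copy the target package onto itself
        nvidia_dir = pkg / "lib" / "nvidia"
        if not nvidia_dir.is_dir():
            continue
        for child in sorted(nvidia_dir.iterdir()):
            if child.is_dir():
                # dirs_exist_ok (Python 3.8+) makes repeated runs safe.
                shutil.copytree(child, target / child.name, dirs_exist_ok=True)
                copied.append(child.name)
    return copied
```

A call would pass the cached nvidia_* package folders as source_pkgs and the cudnn package folder as target_pkg. This is still a workaround that has to be re-run whenever the cache is rebuilt.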
Can anyone in this thread check if the issue still exists on the latest PDM? Much appreciated.
Yes, this occurs even in the latest PDM (2.10.3).
Fine, I'll paste the code comment to give more insight on why it happens: pdm/src/pdm/installers/installers.py Lines 75 to 82 in 837e7d0
PDM only looks at children if the parent dir is a namespace package. And PDM detects a namespace based on these rules: pdm/src/pdm/installers/installers.py Lines 49 to 60 in 837e7d0
So if a package breaks that assumption, PDM doesn't know how to create the symlinks properly. I don't think that is something PDM can fix; otherwise you need to disable install.cache for it.
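The referenced lines are not reproduced in this thread, so the following is not PDM's code. It is a minimal sketch of what such a namespace check could look like, assuming the rules are: a folder without __init__.py is a PEP 420 namespace package, and a folder whose __init__.py contains a pkgutil or pkg_resources namespace declaration is a legacy one. The function name and the marker list are hypothetical.

```python
import os

# Assumed legacy namespace declarations (pkg_resources style and pkgutil style).
NAMESPACE_MARKERS = (
    "__import__('pkg_resources').declare_namespace(__name__)",
    "__path__ = __import__('pkgutil').extend_path(__path__, __name__)",
)


def looks_like_namespace_package(root):
    """Guess whether the directory `root` is a namespace package."""
    init_py = os.path.join(root, "__init__.py")
    if not os.path.exists(init_py):
        return True  # PEP 420 implicit namespace package
    with open(init_py, encoding="utf-8") as f:
        # Normalize quotes so both quoting styles match the markers.
        stripped = [line.strip().replace('"', "'") for line in f]
    return any(line in NAMESPACE_MARKERS for line in stripped)
```

Under these assumed rules, a shared folder that ships a plain __init__.py counts as a regular package, so an installer that only descends into namespace packages would link the whole folder from one cached package and the other packages' subfolders would be missing.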
Yes. I understand that PDM is NOT the main cause. The former has the problem that
The main cause is
The PyTorch-side problem is clearly a different issue. This occurs when The solution is to copy everything (without using the cache method), but I would like to keep the benefit of linking from the cache.
For anyone coming here off search engines... I wiped my lock file and .venv, and the following worked for me (thanks to #2425!):
pdm config --local install.cache_method symlink_individual
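To illustrate why linking files one by one can help when several packages share a folder, here is a small sketch. It is not PDM's implementation. It assumes, going by the option name, that symlink_individual creates real directories in site-packages and symlinks each file separately. The function name link_files_individually is hypothetical.

```python
import os
from pathlib import Path


def link_files_individually(cached_pkg, site_packages):
    """Recreate cached_pkg's tree under site_packages, symlinking each file."""
    cached_pkg = Path(cached_pkg)
    site_packages = Path(site_packages)
    linked = []
    for dirpath, _dirnames, filenames in os.walk(cached_pkg):
        rel = Path(dirpath).relative_to(cached_pkg)
        dest_dir = site_packages / rel
        # Real directories, so several packages can share the same folder.
        dest_dir.mkdir(parents=True, exist_ok=True)
        for name in filenames:
            dest = dest_dir / name
            if dest.is_symlink() or dest.exists():
                dest.unlink()
            dest.symlink_to(Path(dirpath) / name)
            linked.append(dest)
    return linked
```

Because the shared folder is a real directory rather than a link into one cached package, files from a second package end up next to those of the first instead of being hidden.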
This still does not work with pytorch 2.2.0 and the latest pdm. I tried symlink_individual, hardlink and pth (I can't find it in the documentation; maybe it was removed in a newer version of pdm?) and none of them worked.
The issue still exists with pytorch 2.2 and pdm 2.12.3. See #2614.
Make sure you run commands with the -v flag before pasting the output.
Steps to reproduce
pdm add torch (1.13.1 is the latest version currently.)
python -c 'import torch'
Actual behavior
Expected behavior
PyTorch should be imported without any errors.
Environment Information
I "think" this is related to the fact that PyTorch 1.13.x introduced a new set of dependencies around CUDA (pytorch/pytorch#85097). Poetry had issues because of this (pytorch/pytorch#88049); that has since been resolved, but not for pdm. My guess is that pdm installs the CUDA dependencies separately from pytorch, and because of that the pytorch installation doesn't know about them. It's a bummer, because I wanted to give pdm a spin for a new project; for now I'm going to have to stick to poetry. :/