
sentence-transformers 2.2.2 pulling in nvidia packages #2637

Open
gyezheng opened this issue May 10, 2024 · 12 comments

@gyezheng

I am using sentence-transformers-2.2.2.tar.gz and it pulls in the following nvidia packages:

nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl
nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl
nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl
nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl
nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl
nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl
nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl
nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl
nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl
nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl
nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl

When I search for them online, it shows they are under the license: NVIDIA Proprietary Software.
Can I freely use sentence-transformers-2.2.2.tar.gz?

Thanks!

@tomaarsen
Collaborator

Hello!

Yes, these are requirements of the torch Python package that are needed for you to use CUDA, i.e. a GPU. You can freely use them.

Note that if you don't have a GPU, you may want to install torch without CUDA support and then install sentence-transformers. You can use this widget and select "CPU" if that's the case; it'll save you some disk space.
But if you do have a GPU, be sure to install with CUDA support as you've been doing.

  • Tom Aarsen

@gyezheng
Author

Thank you for your reply!
We are the CPU-only case.
I understand that from a technical perspective we can freely use those Nvidia packages. But from a commercial perspective, can we ship them within our own commercial product? Is there any difference between the GPU and CPU cases commercially? Thanks!

@tomaarsen
Collaborator

If you're using the CPU only, then you won't need those CUDA packages. You can install it with:

pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install sentence-transformers

(assuming that you're on Linux).
And yes, torch and sentence-transformers have commercially permissive licenses, i.e. you can use these products within (paid) commercial products.
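If you want to confirm which flavor of torch ended up in an environment, the wheel's local version suffix is a useful tell. Here's a small illustrative helper (the function name is mine, not part of torch or any library):

```python
def is_cpu_build(torch_version: str) -> bool:
    # CPU-only wheels from download.pytorch.org carry a "+cpu" local
    # version suffix (e.g. "1.13.1+cpu"); CUDA wheels use "+cu117",
    # "+cu121", etc. Plain versions from PyPI (e.g. "2.3.0") bundle
    # the nvidia-* CUDA wheels on Linux, so they are not CPU-only.
    return torch_version.endswith("+cpu")

# In a real environment you would pass torch.__version__:
print(is_cpu_build("1.13.1+cpu"))   # True
print(is_cpu_build("2.3.0+cu121"))  # False
```

`torch.cuda.is_available()` returning False is another quick sanity check, though that can also simply mean no GPU or driver is present.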

  • Tom Aarsen

@KyeMaloy97

KyeMaloy97 commented May 13, 2024

So at the moment I have been running two pip commands: the first installs a load of dependencies from a requirements.txt, and the second installs torch with the CPU index URL as you mentioned above.

pip install --no-deps -r requirements.txt
pip install --no-deps -r torch_requirements.txt

Maybe installing sentence-transformers from the first requirements.txt before torch was pulling in the 2.3.0 (CUDA) build of torch as well?
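One way to take install order out of the picture (a sketch assuming Linux and the versions mentioned in this thread) is to pin the CPU index inside the torch requirements file and install that file first:

```text
# torch_requirements.txt -- install this file before requirements.txt
--index-url https://download.pytorch.org/whl/cpu
torch==1.13.1+cpu
torchvision==0.14.1+cpu
```

With torch already satisfied, a later `pip install -r requirements.txt` resolves sentence-transformers against the existing CPU build instead of fetching a CUDA one. (With `--no-deps` on both commands, pip shouldn't be resolving transitive dependencies at all, so it shouldn't pull extra torch builds either way.)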

@KyeMaloy97

If I do pip show torch I see:

Name: torch
Version: 1.13.1+cpu
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /usr/local/lib64/python3.9/site-packages
Requires: typing-extensions
Required-by: sentence-transformers, accelerate

So I'm not sure why or how we are getting the nvidia packages in our scans?

@tomaarsen
Collaborator

If I do pip show torch I see:

...

That is rather odd. Perhaps you can run pip show on the CUDA packages (e.g. pip show nvidia-cublas-cu12) to see what they are required by? Because CPU-only torch should not require CUDA.

  • Tom Aarsen

@KyeMaloy97

KyeMaloy97 commented May 14, 2024

If I run pip show nvidia_cublas... or pip show cuda I get "no packages found". I'm not convinced we are downloading the files our scanner thinks we're getting, as I cannot locate them on disk at all, and in my site-packages folder I don't see anything about nvidia or any .whl files matching what our scanner is finding.

I also think that if I were pulling those CUDA files, the docker image would be a lot larger (it's only ~2.5 GB total; with the CUDA files I think it would be 8 GB+).

pip list gives me:

certifi               2024.2.2
charset-normalizer    3.3.2
click                 8.1.7
contourpy             1.2.1
cycler                0.12.1
eland                 8.12.1
elastic-transport     8.13.0
elasticsearch         8.13.0
filelock              3.14.0
fonttools             4.51.0
fsspec                2024.3.1
huggingface-hub       0.23.0
idna                  3.7
importlib_resources   6.4.0
joblib                1.4.2
kiwisolver            1.4.5
matplotlib            3.8.4
nltk                  3.8.1
numpy                 1.26.4
packaging             24.0
pandas                1.5.3
pillow                10.3.0
pip                   21.2.3
psutil                5.9.8
pyparsing             3.1.2
python-dateutil       2.9.0.post0
pytz                  2024.1
PyYAML                6.0.1
regex                 2024.4.28
requests              2.31.0
safetensors           0.4.3
scikit-learn          1.4.2
scipy                 1.13.0
sentence-transformers 2.2.2
setuptools            53.0.0
six                   1.16.0
tdqm                  0.0.1
threadpoolctl         3.5.0
tokenizers            0.14.1
torch                 1.13.1+cpu
torchvision           0.14.1+cpu
tqdm                  4.66.3
transformers          4.38.0
typing_extensions     4.9.0
urllib3               2.2.1
zipp                  3.18.1

@KyeMaloy97

KyeMaloy97 commented May 14, 2024

For extra info I also installed pipdeptree and this was the output...

accelerate==0.29.3
├── huggingface-hub [required: Any, installed: 0.23.0]
│   ├── filelock [required: Any, installed: 3.14.0]
│   ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
│   ├── packaging [required: >=20.9, installed: 24.0]
│   ├── PyYAML [required: >=5.1, installed: 6.0.1]
│   ├── requests [required: Any, installed: 2.31.0]
│   │   ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│   │   ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│   │   ├── idna [required: >=2.5,<4, installed: 3.7]
│   │   └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│   ├── tqdm [required: >=4.42.1, installed: 4.66.3]
│   └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
├── numpy [required: >=1.17, installed: 1.26.4]
├── packaging [required: >=20.0, installed: 24.0]
├── psutil [required: Any, installed: 5.9.8]
├── PyYAML [required: Any, installed: 6.0.1]
├── safetensors [required: >=0.3.1, installed: 0.4.3]
└── torch [required: >=1.10.0, installed: 1.13.1+cpu]
    └── typing_extensions [required: Any, installed: 4.9.0]
eland==8.12.1
├── elasticsearch [required: >=8.3,<9, installed: 8.13.0]
│   └── elastic-transport [required: >=8.13,<9, installed: 8.13.0]
│       ├── certifi [required: Any, installed: 2024.2.2]
│       └── urllib3 [required: >=1.26.2,<3, installed: 2.2.1]
├── matplotlib [required: >=3.6, installed: 3.8.4]
│   ├── contourpy [required: >=1.0.1, installed: 1.2.1]
│   │   └── numpy [required: >=1.20, installed: 1.26.4]
│   ├── cycler [required: >=0.10, installed: 0.12.1]
│   ├── fonttools [required: >=4.22.0, installed: 4.51.0]
│   ├── importlib_resources [required: >=3.2.0, installed: 6.4.0]
│   │   └── zipp [required: >=3.1.0, installed: 3.18.1]
│   ├── kiwisolver [required: >=1.3.1, installed: 1.4.5]
│   ├── numpy [required: >=1.21, installed: 1.26.4]
│   ├── packaging [required: >=20.0, installed: 24.0]
│   ├── pillow [required: >=8, installed: 10.3.0]
│   ├── pyparsing [required: >=2.3.1, installed: 3.1.2]
│   └── python-dateutil [required: >=2.7, installed: 2.9.0.post0]
│       └── six [required: >=1.5, installed: 1.16.0]
├── numpy [required: >=1.2.0,<2, installed: 1.26.4]
├── packaging [required: Any, installed: 24.0]
└── pandas [required: >=1.5,<2, installed: 1.5.3]
    ├── numpy [required: >=1.20.3, installed: 1.26.4]
    ├── python-dateutil [required: >=2.8.1, installed: 2.9.0.post0]
    │   └── six [required: >=1.5, installed: 1.16.0]
    └── pytz [required: >=2020.1, installed: 2024.1]
pipdeptree==2.20.0
├── packaging [required: >=23.1, installed: 24.0]
└── pip [required: >=23.1.2, installed: 24.0]
sentence-transformers==2.2.2
├── huggingface-hub [required: >=0.4.0, installed: 0.23.0]
│   ├── filelock [required: Any, installed: 3.14.0]
│   ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
│   ├── packaging [required: >=20.9, installed: 24.0]
│   ├── PyYAML [required: >=5.1, installed: 6.0.1]
│   ├── requests [required: Any, installed: 2.31.0]
│   │   ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│   │   ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│   │   ├── idna [required: >=2.5,<4, installed: 3.7]
│   │   └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│   ├── tqdm [required: >=4.42.1, installed: 4.66.3]
│   └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
├── nltk [required: Any, installed: 3.8.1]
│   ├── click [required: Any, installed: 8.1.7]
│   ├── joblib [required: Any, installed: 1.4.2]
│   ├── regex [required: >=2021.8.3, installed: 2024.4.28]
│   └── tqdm [required: Any, installed: 4.66.3]
├── numpy [required: Any, installed: 1.26.4]
├── scikit-learn [required: Any, installed: 1.4.2]
│   ├── joblib [required: >=1.2.0, installed: 1.4.2]
│   ├── numpy [required: >=1.19.5, installed: 1.26.4]
│   ├── scipy [required: >=1.6.0, installed: 1.13.0]
│   │   └── numpy [required: >=1.22.4,<2.3, installed: 1.26.4]
│   └── threadpoolctl [required: >=2.0.0, installed: 3.5.0]
├── scipy [required: Any, installed: 1.13.0]
│   └── numpy [required: >=1.22.4,<2.3, installed: 1.26.4]
├── sentencepiece [required: Any, installed: ?]
├── torch [required: >=1.6.0, installed: 1.13.1+cpu]
│   └── typing_extensions [required: Any, installed: 4.9.0]
├── torchvision [required: Any, installed: 0.14.1+cpu]
│   ├── numpy [required: Any, installed: 1.26.4]
│   ├── pillow [required: >=5.3.0,!=8.3.*, installed: 10.3.0]
│   ├── requests [required: Any, installed: 2.31.0]
│   │   ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│   │   ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│   │   ├── idna [required: >=2.5,<4, installed: 3.7]
│   │   └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│   ├── torch [required: ==1.13.1, installed: 1.13.1+cpu]
│   │   └── typing_extensions [required: Any, installed: 4.9.0]
│   └── typing_extensions [required: Any, installed: 4.9.0]
├── tqdm [required: Any, installed: 4.66.3]
└── transformers [required: >=4.6.0,<5.0.0, installed: 4.38.0]
    ├── filelock [required: Any, installed: 3.14.0]
    ├── huggingface-hub [required: >=0.19.3,<1.0, installed: 0.23.0]
    │   ├── filelock [required: Any, installed: 3.14.0]
    │   ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
    │   ├── packaging [required: >=20.9, installed: 24.0]
    │   ├── PyYAML [required: >=5.1, installed: 6.0.1]
    │   ├── requests [required: Any, installed: 2.31.0]
    │   │   ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
    │   │   ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
    │   │   ├── idna [required: >=2.5,<4, installed: 3.7]
    │   │   └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
    │   ├── tqdm [required: >=4.42.1, installed: 4.66.3]
    │   └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
    ├── numpy [required: >=1.17, installed: 1.26.4]
    ├── packaging [required: >=20.0, installed: 24.0]
    ├── PyYAML [required: >=5.1, installed: 6.0.1]
    ├── regex [required: !=2019.12.17, installed: 2024.4.28]
    ├── requests [required: Any, installed: 2.31.0]
    │   ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
    │   ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
    │   ├── idna [required: >=2.5,<4, installed: 3.7]
    │   └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
    ├── safetensors [required: >=0.4.1, installed: 0.4.3]
    ├── tokenizers [required: >=0.14,<0.19, installed: 0.14.1]
    │   └── huggingface-hub [required: >=0.16.4,<0.18, installed: 0.23.0]
    │       ├── filelock [required: Any, installed: 3.14.0]
    │       ├── fsspec [required: >=2023.5.0, installed: 2024.3.1]
    │       ├── packaging [required: >=20.9, installed: 24.0]
    │       ├── PyYAML [required: >=5.1, installed: 6.0.1]
    │       ├── requests [required: Any, installed: 2.31.0]
    │       │   ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
    │       │   ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
    │       │   ├── idna [required: >=2.5,<4, installed: 3.7]
    │       │   └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
    │       ├── tqdm [required: >=4.42.1, installed: 4.66.3]
    │       └── typing_extensions [required: >=3.7.4.3, installed: 4.9.0]
    └── tqdm [required: >=4.27, installed: 4.66.3]
setuptools==53.0.0
tdqm==0.0.1
└── tqdm [required: Any, installed: 4.66.3]

@tomaarsen
Collaborator

I think that looks fine, then! In fact, if you upgrade from sentence-transformers==2.2.2 to a more recent version, you'll actually lose the NLTK and sentencepiece dependencies. They're not particularly big, though, so I wouldn't worry about it too much.

  • Tom Aarsen

@KyeMaloy97

KyeMaloy97 commented May 14, 2024

Do you happen to know if there's a check I can make to know for certain whether those nvidia*.whl files got installed? I had a look in /usr/bin and /usr/lib/python3.9/site-packages and didn't find anything; running find / -iname "*.whl" and find / -iname "*nvidia*" also returns nothing.
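One programmatic check (a sketch using only the standard library; the helper name is mine) is to ask pip's own metadata store what is installed, which is the same place pip show and pip list consult:

```python
from importlib.metadata import distributions

def nvidia_packages(names):
    # Filter distribution names down to the nvidia-* CUDA wheels,
    # e.g. "nvidia-cublas-cu12"; matching is case-insensitive.
    return sorted(n for n in names if n.lower().startswith("nvidia"))

# Inspect the active environment: an empty list means no nvidia-*
# packages are actually installed, whatever the scanner reports.
installed = [dist.metadata["Name"] or "" for dist in distributions()]
print(nvidia_packages(installed))
```

An empty list here, together with the find results, would suggest the scanner is flagging something other than installed packages.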

@tomaarsen
Collaborator

Searching for cud (to catch both cuda and cudnn) might also help, but other than that I'm not sure.

@KyeMaloy97

KyeMaloy97 commented May 14, 2024

I had a look and it found a load of related files from torch, torchgen, and transformers... most of the files are like:
/usr/local/lib64/python3.9/site-packages/torch/include/ATen/cuda/CUDATensorMethods.cuh and associated header files, or like /usr/local/lib/python3.9/site-packages/transformers/kernels/mra/cuda_kernel.cu

I think these are just source code files shipped with these packages though, not the NVIDIA Proprietary Software.
