
Improve compatibility with CUDA-enabled pytorch on non-CUDA devices #461

Open
mainrs opened this issue May 7, 2024 · 3 comments

Comments

@mainrs

mainrs commented May 7, 2024

Problem Description

The CUDA-enabled PyTorch libraries are more capable than the CPU-only one, and they can also fall back to the CPU if no CUDA device is available. However, due to a race condition, the current code base calls into the CUDA driver even if one passes `--no-cuda` as an argument.

The issue is this line of code:

device = torch.device("cuda" if torch.cuda.is_available() and args.cuda else "cpu")

It should first check whether the flag is set and only then call `torch.cuda.is_available`. That way, the program runs perfectly fine in those scenarios.

Possible Solution

device = torch.device("cuda" if args.cuda and torch.cuda.is_available() else "cpu")
@pseudo-rnd-thoughts
Collaborator

Could you clarify why this is a "race condition"?

It shouldn't be cuda unless both conditions are true, so I don't understand why the ordering would matter in this case.

@mainrs
Author

mainrs commented May 9, 2024

Because Python's `and` is lazy: if the first operand is false, it doesn't evaluate the second one. In the current code, `torch.cuda.is_available` is the first operand, so Python still runs it even if I pass `--no-cuda`, i.e. even though I specified that I do not want to use CUDA.

On devices that don't have CUDA drivers (or have the stub drivers) but have the CUDA version of PyTorch installed, this throws a runtime error. However, using the CPU on such devices is valid, since the PyTorch library can still function by using the CPU.
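The short-circuit behaviour being described can be sketched without PyTorch at all. In this minimal sketch, `fake_is_available` is a hypothetical stand-in for `torch.cuda.is_available` that records when it is called:

```python
# Demonstration of why operand order matters with Python's lazy `and`.
# `fake_is_available` and `Args` are illustrative stand-ins, not the
# project's actual code.

calls = []

def fake_is_available():
    # Stand-in for torch.cuda.is_available(), which can raise a
    # RuntimeError on machines that have CUDA-enabled PyTorch
    # installed but no working CUDA driver.
    calls.append("is_available")
    return False

class Args:
    cuda = False  # the user passed --no-cuda

args = Args()

# Buggy order: fake_is_available() runs even though args.cuda is False.
device = "cuda" if fake_is_available() and args.cuda else "cpu"
assert calls == ["is_available"]

calls.clear()

# Fixed order: args.cuda is False, so `and` short-circuits and the
# availability check is never evaluated.
device = "cuda" if args.cuda and fake_is_available() else "cpu"
assert calls == []
assert device == "cpu"
```

With the fixed ordering, a machine without a CUDA driver never touches the availability check when `--no-cuda` is given, so no runtime error can be raised.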

@pseudo-rnd-thoughts
Collaborator

Ok, that makes sense; I thought PyTorch would be smart enough not to raise a runtime error from this function.
I would welcome a PR that makes your suggested change.
