
❓ [Question] How to solve this warning: Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. #2813

Closed
demuxin opened this issue May 6, 2024 · 6 comments · Fixed by #2824
Labels
question Further information is requested

Comments

demuxin commented May 6, 2024

❓ Question

I used Torch-TensorRT to compile a TorchScript model in C++. When compiling or loading the Torch-TensorRT model, many warnings are displayed:

WARNING: [Torch-TensorRT] - Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. For more on the implications of this as well as workarounds, see the linked documentation (https://pytorch.org/TensorRT/user_guide/runtime.html#multi-device-safe-mode)
WARNING: [Torch-TensorRT] - Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. For more on the implications of this as well as workarounds, see the linked documentation (https://pytorch.org/TensorRT/user_guide/runtime.html#multi-device-safe-mode)
WARNING: [Torch-TensorRT] - Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. For more on the implications of this as well as workarounds, see the linked documentation (https://pytorch.org/TensorRT/user_guide/runtime.html#multi-device-safe-mode)

What you have already tried

I found this link useful, but it only documents the Python API.

I checked the source code, but I still haven't figured out how to set up MULTI_DEVICE_SAFE_MODE in C++.
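For reference, the toggle that the linked runtime documentation describes is Python-only; a minimal sketch of that usage (per the docs, no C++ equivalent is exposed in the public headers at the time of this issue):

```python
import torch_tensorrt

# Enable multi-device safe mode globally before loading or running the engine
torch_tensorrt.runtime.set_multi_device_safe_mode(True)
```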

What can I do to address this warning?

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0):
  • CPU Architecture: x86
  • OS (e.g., Linux): ubuntu18
  • How you installed PyTorch (conda, pip, libtorch, source): libtorch
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version: 12.2
  • GPU models and configuration: 1080Ti
  • Any other relevant information:
demuxin added the question Further information is requested label May 6, 2024

demuxin commented May 6, 2024

Moreover, when the model runs inference through the forward function, these warnings appear:

WARNING: [Torch-TensorRT] - Using default stream in enqueue()/enqueueV2()/enqueueV3() may lead to performance issues due to additional cudaDeviceSynchronize() calls by TensorRT to ensure correct synchronizations. Please use non-default stream instead.
WARNING: [Torch-TensorRT] - Using default stream in enqueue()/enqueueV2()/enqueueV3() may lead to performance issues due to additional cudaDeviceSynchronize() calls by TensorRT to ensure correct synchronizations. Please use non-default stream instead.
WARNING: [Torch-TensorRT] - Using default stream in enqueue()/enqueueV2()/enqueueV3() may lead to performance issues due to additional cudaDeviceSynchronize() calls by TensorRT to ensure correct synchronizations. Please use non-default stream instead.
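(A common workaround for this default-stream warning in LibTorch C++ is to run the forward call on a stream drawn from the CUDA stream pool. A minimal sketch, assuming a loaded TorchScript `module` and an `input` tensor already on the GPU; `infer_on_side_stream` is a hypothetical helper name, and the c10 stream API calls are the standard LibTorch ones:)

```cpp
#include <torch/script.h>
#include <c10/cuda/CUDAStream.h>
#include <c10/cuda/CUDAGuard.h>

// Run one forward pass on a non-default CUDA stream, so TensorRT does not
// insert extra cudaDeviceSynchronize() calls for default-stream safety.
at::Tensor infer_on_side_stream(torch::jit::script::Module& module,
                                const at::Tensor& input) {
  c10::cuda::CUDAStream stream = c10::cuda::getStreamFromPool();
  at::Tensor output;
  {
    c10::cuda::CUDAStreamGuard guard(stream);  // make `stream` current on this thread
    output = module.forward({input}).toTensor();
  }
  stream.synchronize();  // wait for the inference work before the caller reads `output`
  return output;
}
```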

gs-olive (Collaborator) commented May 6, 2024

Hi @demuxin - thanks for the report - we likely need to add getter/setter methods to toggle this value in C++ as well:

bool MULTI_DEVICE_SAFE_MODE = false;

Similar to the following functions:
m.def("get_multi_device_safe_mode", []() -> bool { return MULTI_DEVICE_SAFE_MODE; });
m.def("set_multi_device_safe_mode", [](bool multi_device_safe_mode) -> void {

We are aware of the default stream warning and are working on it. From what I've seen, it should not have a substantial effect on inference.
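(For illustration, the bindings quoted above wrap a plain global flag with accessor functions. A standalone sketch of that getter/setter pattern — the names mirror the snippet, but this is a stand-in, not the Torch-TensorRT implementation itself:)

```cpp
#include <cassert>

// Hypothetical stand-in for the runtime's global flag shown above.
static bool MULTI_DEVICE_SAFE_MODE = false;

// Getter: reports the current value of the flag.
bool get_multi_device_safe_mode() {
  return MULTI_DEVICE_SAFE_MODE;
}

// Setter: toggles the flag; this is the piece the issue asks to reach from C++.
void set_multi_device_safe_mode(bool multi_device_safe_mode) {
  MULTI_DEVICE_SAFE_MODE = multi_device_safe_mode;
}
```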

gs-olive (Collaborator) commented May 6, 2024

These may actually already be accessible in C++. Prior to inference, could you try adding the line:
torch_tensorrt::core::runtime::set_multi_device_safe_mode(true);

demuxin (Author) commented May 7, 2024

Hi @gs-olive, that doesn't work; there is a compile error:

error: ‘set_multi_device_safe_mode’ is not a member of ‘torch_tensorrt::core::runtime’
   27 |         torch_tensorrt::core::runtime::set_multi_device_safe_mode(true);

gs-olive (Collaborator) commented May 8, 2024

Does torch::ops::tensorrt::get_multi_device_safe_mode() exist, or does this also cause a compilation error?

demuxin (Author) commented May 9, 2024

This also causes a similar compilation error:

error: ‘torch::ops’ has not been declared
   24 |         torch::ops::tensorrt::get_multi_device_safe_mode(true);

I searched the LibTorch and Torch-TensorRT header files, and there are no functions related to multi_device_safe_mode.
