
❓ [Question] How to solve this warning: Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. #2813

Closed
demuxin opened this issue May 6, 2024 · 6 comments · Fixed by #2824
Labels
question Further information is requested

Comments

demuxin commented May 6, 2024

❓ Question

I used Torch-TensorRT to compile a TorchScript model in C++. When compiling or loading the Torch-TensorRT model, many warnings are displayed:

WARNING: [Torch-TensorRT] - Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. For more on the implications of this as well as workarounds, see the linked documentation (https://pytorch.org/TensorRT/user_guide/runtime.html#multi-device-safe-mode)
WARNING: [Torch-TensorRT] - Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. For more on the implications of this as well as workarounds, see the linked documentation (https://pytorch.org/TensorRT/user_guide/runtime.html#multi-device-safe-mode)
WARNING: [Torch-TensorRT] - Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. For more on the implications of this as well as workarounds, see the linked documentation (https://pytorch.org/TensorRT/user_guide/runtime.html#multi-device-safe-mode)

What you have already tried

I found this link useful, but it only documents the Python API.

I checked the source code, but I still haven't figured out how to set up MULTI_DEVICE_SAFE_MODE in C++.
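For reference, the toggle that the linked runtime documentation describes is Python-only; a minimal sketch of that usage (per the docs, no C++ equivalent is exposed in the public headers at the time of this issue):

```python
import torch_tensorrt

# Enable multi-device safe mode globally before loading or running the engine
torch_tensorrt.runtime.set_multi_device_safe_mode(True)
```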

What can I do to address this warning?

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0):
  • CPU Architecture: x86
  • OS (e.g., Linux): ubuntu18
  • How you installed PyTorch (conda, pip, libtorch, source): libtorch
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version: 12.2
  • GPU models and configuration: 1080Ti
  • Any other relevant information:
demuxin added the question Further information is requested label May 6, 2024

demuxin commented May 6, 2024

Moreover, when the model runs inference through the forward function, these warnings appear:

WARNING: [Torch-TensorRT] - Using default stream in enqueue()/enqueueV2()/enqueueV3() may lead to performance issues due to additional cudaDeviceSynchronize() calls by TensorRT to ensure correct synchronizations. Please use non-default stream instead.
WARNING: [Torch-TensorRT] - Using default stream in enqueue()/enqueueV2()/enqueueV3() may lead to performance issues due to additional cudaDeviceSynchronize() calls by TensorRT to ensure correct synchronizations. Please use non-default stream instead.
WARNING: [Torch-TensorRT] - Using default stream in enqueue()/enqueueV2()/enqueueV3() may lead to performance issues due to additional cudaDeviceSynchronize() calls by TensorRT to ensure correct synchronizations. Please use non-default stream instead.
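(A common workaround for this default-stream warning in LibTorch C++ is to run the forward call on a stream drawn from the CUDA stream pool. A minimal sketch, assuming a loaded TorchScript `module` and an `input` tensor already on the GPU; `infer_on_side_stream` is a hypothetical helper name, and the c10 stream API calls are the standard LibTorch ones:)

```cpp
#include <torch/script.h>
#include <c10/cuda/CUDAStream.h>
#include <c10/cuda/CUDAGuard.h>

// Run one forward pass on a non-default CUDA stream, so TensorRT does not
// insert extra cudaDeviceSynchronize() calls for default-stream safety.
at::Tensor infer_on_side_stream(torch::jit::script::Module& module,
                                const at::Tensor& input) {
  c10::cuda::CUDAStream stream = c10::cuda::getStreamFromPool();
  at::Tensor output;
  {
    c10::cuda::CUDAStreamGuard guard(stream);  // make `stream` current on this thread
    output = module.forward({input}).toTensor();
  }
  stream.synchronize();  // wait for the inference work before the caller reads `output`
  return output;
}
```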

gs-olive (Collaborator) commented May 6, 2024

Hi @demuxin - thanks for the report - we likely need to add getter/setter methods to toggle this value in C++ as well:

bool MULTI_DEVICE_SAFE_MODE = false;

Similar to the following functions:
m.def("get_multi_device_safe_mode", []() -> bool { return MULTI_DEVICE_SAFE_MODE; });
m.def("set_multi_device_safe_mode", [](bool multi_device_safe_mode) -> void {

We are aware of the default stream warning and are working on it. From what I've seen, it should not have a substantial effect on inference.
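(For illustration, the bindings quoted above wrap a plain global flag with accessor functions. A standalone sketch of that getter/setter pattern — the names mirror the snippet, but this is a stand-in, not the Torch-TensorRT implementation itself:)

```cpp
#include <cassert>

// Hypothetical stand-in for the runtime's global flag shown above.
static bool MULTI_DEVICE_SAFE_MODE = false;

// Getter: reports the current value of the flag.
bool get_multi_device_safe_mode() {
  return MULTI_DEVICE_SAFE_MODE;
}

// Setter: toggles the flag; this is the piece the issue asks to reach from C++.
void set_multi_device_safe_mode(bool multi_device_safe_mode) {
  MULTI_DEVICE_SAFE_MODE = multi_device_safe_mode;
}
```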

gs-olive (Collaborator) commented May 6, 2024

These may actually already be accessible in C++. Prior to inference, could you try adding the line:
torch_tensorrt::core::runtime::set_multi_device_safe_mode(true);

demuxin (Author) commented May 7, 2024

Hi @gs-olive, that doesn't work; there is a compile error:

error: ‘set_multi_device_safe_mode’ is not a member of ‘torch_tensorrt::core::runtime’
   27 |         torch_tensorrt::core::runtime::set_multi_device_safe_mode(true);

gs-olive (Collaborator) commented May 8, 2024

Does torch::ops::tensorrt::get_multi_device_safe_mode() exist, or does this also cause a compilation error?

demuxin (Author) commented May 9, 2024

This also causes a similar compilation error:

error: ‘torch::ops’ has not been declared
   24 |         torch::ops::tensorrt::get_multi_device_safe_mode(true);

I searched the LibTorch and Torch-TensorRT header files, and there are no functions related to multi_device_safe_mode.
