New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix FindNVTX.cmake #3421
Fix FindNVTX.cmake #3421
Conversation
Signed-off-by: Max H. Gerlach <git@maxgerlach.de>
cmake/Modules/FindNVTX.cmake
Outdated
@@ -10,7 +10,7 @@ | |||
|
|||
set(HOROVOD_NVTX_INCLUDE $ENV{HOROVOD_NVTX_INCLUDE} CACHE PATH "Folder containing NVIDIA NVTX3 headers") | |||
|
|||
list(APPEND NVTX_ROOT ${CUDAToolkit_LIBRARY_ROOT}) | |||
list(APPEND NVTX_ROOT ${CUDAToolkit_INCLUDE_DIRS}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to put ${CUDAToolkit_INCLUDE_DIRS}
on line 19 as an extra hint to search include?
CMAKE_PREFIX_PATH
takes the root dir and append bin
, lib
, or include
for the appropriate search: https://cmake.org/cmake/help/latest/variable/CMAKE_PREFIX_PATH.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my system, setting NVTX_ROOT
properly seems to have already done the trick.
Would you say that adding this extra hint would help even more (on other systems)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For CMake 3.12+ (we require 3.13 now), setting <package>_ROOT
correctly should be sufficient for find_path
: https://cmake.org/cmake/help/latest/command/find_path.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It potentially ends up to be the same but CMAKE_PREFIX_PATH
is usually used for the parent (root
) directory instead of the subdir include
. <PackageName>_ROOT
is used in the same way. Specific subdir (bin
/lib
/include
) hints are usually added to the corresponding find_*
HINTS
argument.
On my container CUDAToolkit_LIBRARY_ROOT
is empty, which is weird, so relying on CUDAToolkit_INCLUDE_DIRS
instead seems to be a good idea, but maybe it is set in some cases?
I would just have added CUDAToolkit_INCLUDE_DIRS
to the find_path:
find_path(NVTX_INCLUDE_DIR
NAMES nvtx3/nvToolsExt.h
HINTS ${HOROVOD_NVTX_INCLUDE} ${CUDAToolkit_INCLUDE_DIRS})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the end as you mentioned it works in both cases and here we seem to only need to search for just the include of nvtx. Out of curiosity, do you know why we don't need to find the nvtx lib too libnvToolsExt.so
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, understood. It would probably be clearer not to plug CUDAToolkit_INCLUDE_DIRS
into NVTX_ROOT
(although it works for us as we don't need the .so
).
Out of curiosity, do you know why we don't need to find the nvtx lib too libnvToolsExt.so?
The Nsight Systems User Guide only speaks of the header file: https://docs.nvidia.com/nsight-systems/UserGuide/index.html#nvtx-trace Might it be necessary to explicitly link that library for use cases where one does not attach that profiler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact we are using the nvtx header-only library, so the shared lib is never needed:
https://gitlab.kitware.com/cmake/cmake/-/issues/21377
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with your suggestion, @nvcastet, there's no need to deal with the NVTX_ROOT
variable at all here.
Have you checked all the other places where |
@EnricoMi, very good point. I will give the other replacements a look. Most likely they were false, too. |
Signed-off-by: Max H. Gerlach <git@maxgerlach.de>
Signed-off-by: Max H. Gerlach <git@maxgerlach.de>
Signed-off-by: Max H. Gerlach <git@maxgerlach.de>
Improved two other places where I had thought |
Then I am happy with this, I leave this to @nvcastet to approve |
Assuming CI passes, LGTM. |
Unit Test Results (with flaky tests) 951 files + 76 951 suites +76 11h 2m 7s ⏱️ + 58m 22s For more details on these failures, see this check. Results for commit fc3ebe0. ± Comparison against base commit 046c071. |
CI passes besides known failures with head framework versions (Spark Lightning, TF batchnorm, TF hvd.join on GPU) |
This fixes a bug that I've introduced with #3261.
NVTX would not be found when building with
HOROVOD_CUDA_HOME
to locate CUDA.