Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Consider removing spdlog dependency for substantial compile time improvements #1222

Open
ahendriksen opened this issue Feb 23, 2023 · 4 comments
Labels
? - Needs Triage Need team to review and classify bug Something isn't working

Comments

@ahendriksen
Copy link
Contributor

Describe the bug

Including the spdlog headers is quite expensive. Just adding #include <spdlog/spdlog.h> to an empty file adds 2.8 seconds to the compilation time. For the pairwise distance kernels in the RAFT library, removing the spdlog include can reduce compile times by 50%.

Steps/Code to reproduce bug

time nvcc -arch sm_70 -I/path/to/spdlog/include -x cu -c <(echo '// empty file')
real    0m1.042s

time nvcc -arch sm_70 -I/path/to/spdlog/include -x cu -c <(echo '#include <spdlog/spdlog.h> ')
real    0m3.840s

Expected behavior
A smaller increase in compile time. For context, including <string> adds on the order of 100ms to the compilation time:

time nvcc -arch sm_70 -I/path/to/spdlog/include -x cu -c <(echo '#include <string> ')
real    0m1.160s

Additional context

RAFT
RAFT also uses spdlog. An issue has been opened there as well.

Reason
The reason that compilation takes much longer is that spdlog instantiates a bunch of templates in every translation unit when used as a header only library. This happens in pattern_formatter::handle_flag_, which is instantiated here. Just adding back the spdlog header doubles the compile times of cicc (device side) and also gcc on the host side.

Precompiled-library
Another option is to not use spdlog as a header only library. The effect can be simulated by defining SPDLOG_COMPILED_LIB. When this is defined, spdlog adds only 0.5 seconds:

time nvcc -DSPDLOG_COMPILED_LIB -arch sm_70 -I/home/ahendriksen/projects/raft-spdlog-issue/cpp/build/_deps/spdlog-src/include -x cu -c <(echo '#include <spdlog/spdlog.h> ')
real    0m1.520s

time nvcc -DSPDLOG_COMPILED_LIB -arch sm_70 -I/home/ahendriksen/projects/raft-spdlog-issue/cpp/build/_deps/spdlog-src/include -x cu -c <(echo '//empty file')
real    0m1.053s
@cjnolet
Copy link
Member

cjnolet commented Mar 12, 2023

What if we were to make the use of header-only SPDLOG optiona (and maybe keep it on default)? While I was investigating the compile times on RAFT, I happened to notice that the only different on the CMAKE side to enable this should be changing the spdlog target that rmm links against.

@harrism
Copy link
Member

harrism commented Mar 13, 2023

PR welcome.

rapids-bot bot pushed a commit that referenced this issue Apr 20, 2023
Related to issue #1222 and also PR #1232. Compared to #1232, this PR might make it able to also have fast builds without precompiling spdlog. 

I include a table below showing which headers transitively include `rmm/logger.hpp` before and after PR (in debug and release builds). These are the rmm headers used by RAFT.

| Header                                              | Before         | After          |
|-----------------------------------------------------|----------------|----------------|
| rmm/cuda_device.hpp                                 | debug  release |                |
| rmm/cuda_stream.hpp                                 | debug  release | debug          |
| rmm/cuda_stream_pool.hpp                            | debug  release | debug          |
| rmm/cuda_stream_view.hpp                            | debug  release |                |
| rmm/device_buffer.hpp                               | debug  release |                |
| rmm/device_scalar.hpp                               | debug  release |                |
| rmm/device_uvector.hpp                              | debug  release |                |
| rmm/device_vector.hpp                               | debug  release |                |
| rmm/exec_policy.hpp                                 | debug  release |                |
| rmm/logger.hpp                                      | debug  release | debug  release |
| rmm/mr/device/aligned_resource_adaptor.hpp          | debug  release |                |
| rmm/mr/device/arena_memory_resource.hpp             | debug  release | debug  release |
| rmm/mr/device/binning_memory_resource.hpp           | debug  release | debug  release |
| rmm/mr/device/callback_memory_resource.hpp          | debug  release |                |
| rmm/mr/device/cuda_async_memory_resource.hpp        | debug  release |                |
| rmm/mr/device/cuda_async_view_memory_resource.hpp   | debug  release |                |
| rmm/mr/device/cuda_memory_resource.hpp              | debug  release |                |
| rmm/mr/device/device_memory_resource.hpp            | debug  release |                |
| rmm/mr/device/failure_callback_resource_adaptor.hpp | debug  release |                |
| rmm/mr/device/fixed_size_memory_resource.hpp        | debug  release | debug  release |
| rmm/mr/device/limiting_resource_adaptor.hpp         | debug  release |                |
| rmm/mr/device/logging_resource_adaptor.hpp          | debug  release | debug  release |
| rmm/mr/device/managed_memory_resource.hpp           | debug  release |                |
| rmm/mr/device/owning_wrapper.hpp                    | debug  release |                |
| rmm/mr/device/per_device_resource.hpp               | debug  release |                |
| rmm/mr/device/polymorphic_allocator.hpp             | debug  release |                |
| rmm/mr/device/pool_memory_resource.hpp              | debug  release | debug  release |
| rmm/mr/device/statistics_resource_adaptor.hpp       | debug  release |                |
| rmm/mr/device/thread_safe_resource_adaptor.hpp      | debug  release |                |
| rmm/mr/device/thrust_allocator_adaptor.hpp          | debug  release |                |
| rmm/mr/device/tracking_resource_adaptor.hpp         | debug  release | debug  release |
| rmm/mr/host/host_memory_resource.hpp                |                |                |
| rmm/mr/host/new_delete_resource.hpp                 |                |                |
| rmm/mr/host/pinned_memory_resource.hpp              | debug  release |                |

Authors:
  - Allard Hendriksen (https://github.com/ahendriksen)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1241
@vyasr
Copy link
Contributor

vyasr commented Jun 2, 2023

@ahendriksen is this still worth pursuing, or were the changes in #1241 sufficient that we don't need to worry about this anymore? If we do still want this, do we need to reopen #1232?

@ahendriksen
Copy link
Contributor Author

I would say: let's keep it in the back of mind, but the urgency for removing spdlog is gone.

For RAFT, the compile time issues have been mostly resolved by #1241. RAFT works around the inclusion of spdlog in pool_memory_resource.hpp (overview).

For projects starting out with RMM, the out-of-the-box compile time should be greatly improved.

Summary:

  • Issue can be closed as far as RAFT is concerned.
  • Other RAPIDS projects may benefit from Adding option to allow spdlog to be compiled. #1232. I cannot speak for their demand.
  • A nice to have would be to make sure that the pool memory resource also includes spdlog only in debug builds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
? - Needs Triage Need team to review and classify bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants