Minor changes to improve compilation speed #137

Open · wants to merge 5 commits into master
Conversation

Aeledfyr

This crate is small and compiles quickly, but it's also used everywhere, so trying to improve the compilation speed is worth it.

This PR has a number of optimizations; each commit is independent, so feel free to cherry-pick and take whichever changes you want. This PR preserves the current MSRV of 1.13.

Changes:

  • Add `rust-version = "1.13"` to the Cargo.toml, to partially document the current MSRV
  • Change internal methods that use `AsRef<Path>` to just take `&Path` (see the sketch after this list)
  • Use loops instead of iterator combinators in parsing functions
  • Make the `debug!` macro generate no code when not in use
  • Use a Vec instead of a HashMap when counting physical cores; using the HashMap requires compiling a lot more code. (The Vec should also be faster.)
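
To make the `AsRef<Path>` change concrete, here is a minimal sketch of the pattern (not the crate's actual code; `parse_cpu_file` is a hypothetical helper name):

```rust
use std::path::Path;

// Before: a generic signature means every distinct caller type
// (&str, String, PathBuf, ...) instantiates its own copy of the body.
fn parse_cpu_file_generic<P: AsRef<Path>>(path: P) -> Option<usize> {
    parse_cpu_file(path.as_ref())
}

// After: internal helpers take `&Path` directly, so only one copy is
// compiled and handed to LLVM no matter how many call sites there are.
fn parse_cpu_file(path: &Path) -> Option<usize> {
    // ... open `path`, read it, and parse a CPU count (elided) ...
    let _ = path;
    None
}

fn main() {
    let _ = parse_cpu_file_generic("/sys/devices/system/cpu/online");
    let _ = parse_cpu_file(Path::new("/sys/devices/system/cpu/online"));
}
```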

My non-scientific benchmarks of a clean build of the library (libc + num_cpus):

  • in release mode: 0.8s -> 0.7s (-13%)
  • in debug mode: 0.63s -> 0.60s (-5%)

Measuring the amount of code passed to LLVM, using cargo-llvm-lines:
`cargo llvm-lines --lib -p num_cpus` (with `--release` for release builds)

  • in release mode: 16325 -> 13039 (-20%)
  • in debug mode: 24263 -> 18781 (-22%)

Iterator combinators have a noticeable compilation time overhead
because they must be monomorphized and end up passing more code
to LLVM.  This code does get optimized out, but that takes time
and slows down the overall build.
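
As an illustration of this kind of rewrite (a hypothetical parsing helper, not the PR's actual diff):

```rust
// Combinator version: the closure passed to filter_map and the adapter
// types themselves are all generic instantiations that LLVM has to churn
// through before discarding most of them.
fn count_cpus_combinators(list: &str) -> usize {
    list.split(',')
        .filter_map(|s| s.trim().parse::<usize>().ok())
        .count()
}

// Loop version: the same logic with no adapter types to monomorphize,
// so the IR handed to LLVM is already close to its final shape.
fn count_cpus_loop(list: &str) -> usize {
    let mut count = 0;
    for part in list.split(',') {
        if part.trim().parse::<usize>().is_ok() {
            count += 1;
        }
    }
    count
}

fn main() {
    assert_eq!(count_cpus_combinators("0, 1, 2, 3"), 4);
    assert_eq!(count_cpus_loop("0, 1, 2, 3"), 4);
}
```
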
I'm not sure how much of a compile time impact this makes, but this
prevents the compiler from having to generate formatting code when
the debug macro is not in use.
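
One way to achieve this is to gate the macro definition itself so that the disabled variant expands to nothing. This is a sketch of the idea; the `debug_assertions` gate is illustrative rather than the exact condition the commit uses:

```rust
// Enabled variant: forwards its arguments to println!, which pulls in
// the usual format_args! machinery.
#[cfg(debug_assertions)]
macro_rules! debug {
    ($($arg:tt)*) => { println!($($arg)*) };
}

// Disabled variant: expands to nothing, so no formatting code is ever
// generated for the macro's arguments.
#[cfg(not(debug_assertions))]
macro_rules! debug {
    ($($arg:tt)*) => {};
}

fn main() {
    debug!("logical cpus: {}", 8);
}
```
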
The current implementation uses a HashMap to deduplicate the output
from each core of the same CPU.  This commit instead collects the
output for each core into a Vec, and then sorts it to deduplicate
physical CPUs.

This reduces the code size processed by LLVM by 15-20%, as counted
by `cargo llvm-lines --lib -p num_cpus` on both debug and release.

These implementations have different performance characteristics:
- the HashMap must hash each key, and SipHash is slow on small keys
- the number of cores will be small (<1024) so sorting the list
  should be very fast
- the list will likely already be sorted

I have not benchmarked this code, but it should be around the same
speed or slightly faster (from testing against randomized lists).
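
A simplified sketch of the Vec-based approach (it takes already-parsed (physical package id, core id) pairs rather than reading /proc/cpuinfo, and `count_physical_cores` here is a made-up name for the example):

```rust
// Deduplicate (physical package id, core id) pairs without a HashMap:
// sort the Vec, drop adjacent duplicates, and count what remains.
fn count_physical_cores(mut pairs: Vec<(u32, u32)>) -> usize {
    pairs.sort();
    pairs.dedup();
    pairs.len()
}

fn main() {
    // Two packages with two cores each, each core reported twice
    // (once per SMT sibling in /proc/cpuinfo).
    let pairs = vec![
        (0, 0), (0, 0), (0, 1), (0, 1),
        (1, 0), (1, 0), (1, 1), (1, 1),
    ];
    assert_eq!(count_physical_cores(pairs), 4);
}
```
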
Aeledfyr commented Feb 25, 2024:

Improved benchmarks, compiling only num_cpus by directly invoking rustc (with the command given by cargo's `--verbose` output):

  • release build: from 364.4 ms ± 7.6 ms to 270.4 ms ± 7.8 ms (-25%)
  • debug build: from 70.1 ms ± 3.0 ms to 60.4 ms ± 2.7 ms (-13%)

Aeledfyr commented:

It looks like most of the CI failures are due to an updated Rust toolchain:

  • Nightly tests fail to compile because of a new warning (and `#[deny(warnings)]`)
  • Cross compiling to some architectures fails because of missing rust-std components

The only test that actually fails is test-cgroups, which fails in the initial (non-cgroups) test because it expects to run on a machine with 2 cores, but the CI now runs it on a 4-core machine. I'm not 100% sure that's what is happening, but it may be worth rerunning CI on the main branch to see if that's the issue.
