Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Substantial debug-test-optimized-depedencies Performance Regression in 1.59 #97460

Open
TheBlueMatt opened this issue May 27, 2022 · 14 comments
Open
Labels
C-bug Category: This is a bug. E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@TheBlueMatt
Copy link

TheBlueMatt commented May 27, 2022

We were trying to change our Cargo.toml to

[profile.dev.package."*"]
opt-level = 2

to get opt on our dependencies during test but not on the crate being tested. In rustc 1.58.1, this is a performance improvement in the generated test suite. On rustc 1.59.0 this is a very substantial regression in total runtime of tests (~2 minutes CPU time on 1.58.1 to 18 minutes CPU time, not including compile time). This effect is robust with different opt-levels (swapping the 1s and 2s above) and between Debian rustc and rustup rustc.

The increase in CPU time seems to almost all be in std/core, perf on the regressed binary for a single test (similar results for all tests) shows the following, but core itself doesn't show up at all in perf on 1.58.1.

+   26.40%    24.37%  ln::functional_  lightning-8c1dd3bce02d4ddc   [.] core::ptr::mut_ptr::<impl *mut T>::is_null
+   25.79%    23.79%  ln::functional_  lightning-8c1dd3bce02d4ddc   [.] <core::slice::iter::IterMut<T> as core::iter::traits::iterator::Iterator>::next
+   23.12%    21.85%  ln::functional_  lightning-8c1dd3bce02d4ddc   [.] core::ptr::mut_ptr::<impl *mut T>::guaranteed_eq
+   11.31%    10.16%  ln::functional_  lightning-8c1dd3bce02d4ddc   [.] std::io::readbuf::ReadBuf::initialize_unfilled_to
+    9.80%     6.16%  ln::functional_  lightning-8c1dd3bce02d4ddc   [.] core::ptr::non_null::NonNull<T>::as_ptr
+    8.65%     6.89%  ln::functional_  lightning-8c1dd3bce02d4ddc   [.] core::ptr::metadata::from_raw_parts_mut
+    2.46%     2.45%  ln::functional_  lightning-8c1dd3bce02d4ddc   [.] core::ptr::non_null::NonNull<T>::new_unchecked

This effect is also robust against -Zbuild-std=panic_abort (and ps aux confirms that core is getting built with opt-level=2) and even RUSTFLAGS="-C prefer-dynamic" (which ldd confirms dynamically links libstd+libtest).

The PR to do this on our repo is at lightningdevkit/rust-lightning#1497 which you should be able to simply check out locally and cargo test with 1.58 and 1.59 (well, run it twice to skip the compile part of the test) to see the regression. Specific numbers at #97460 (comment).

For those reading the comments there was some confusion about the cargo 1.57 change on the test profile, you can skip to #97460 (comment)

@rustbot modify labels: +regression-from-stable-to-stable -regression-untriaged

@TheBlueMatt TheBlueMatt added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels May 27, 2022
@rustbot rustbot added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. I-prioritize Issue: Indicates that prioritization has been requested for this issue. and removed regression-untriaged Untriaged performance or correctness regression. labels May 27, 2022
@TheBlueMatt TheBlueMatt changed the title Substantial Performance Regression in 1.59 Substantial debug-build Performance Regression in 1.59 May 27, 2022
@wesleywiser
Copy link
Member

[profile.dev]
opt-level = 1

Sets the rustc -Copt-level=1 flag when compiling your dependencies as well as your crate.

[profile.dev.package."*"]
opt-level = 2

Sets the rustc -Copt-level=2 flag when compiling only your dependencies, leaving your crate built with -Copt-level=0.

@wesleywiser wesleywiser removed regression-from-stable-to-stable Performance or correctness regression from one stable version to another. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels May 27, 2022
@TheBlueMatt
Copy link
Author

Sets the rustc -Copt-level=1 flag when compiling your dependencies as well as your crate.

So this changed in cargo/rustc 1.59?

@TheBlueMatt
Copy link
Author

Sets the rustc -Copt-level=1 flag when compiling your dependencies as well as your crate.

Also, this doesn't match ps aux output during build - on 1.58.1 and 1.59 I show no opt-level argument passed to rustc at all for the --test build.

@TheBlueMatt
Copy link
Author

So this changed in cargo/rustc 1.59?

To answer my own question, yes, this changed in 1.59, dev now passes opt-level=1 to rustc on 1.59 on the --test build. That said, it doesn't address the point here - opt-level=0 (looking at rustc calls, at least) on 1.58 led to substantially more performant code.

@TheBlueMatt
Copy link
Author

TheBlueMatt commented May 27, 2022

(FWIW this does match my observation that rustc 1.59 [dev] opt-level=1 runs much faster than 1.58 [dev] opt-level=1, but it seems there's no option to get back to the performance/compile-time tradeoffs we had at 1.58)

@wesleywiser
Copy link
Member

wesleywiser commented May 27, 2022

So this changed in cargo/rustc 1.59?

I'm not aware of any changes to rustc or cargo in the 1.59 release that would have an effect here. (You can find the list of PRs that landed in 1.59 for both projects here and here)

Also, this doesn't match ps aux output during build - on 1.58.1 and 1.59 I show no opt-level argument passed to rustc at all for the --test build.

I see this with Rust 1.59:

$ cat Cargo.toml
[package]
name = "hello_world"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]

[profile.dev]
opt-level = 1

$ strace -e execve -f -- cargo +1.59 test
...
[pid 12014] execve("/home/wesley/.rustup/toolchains/1.59-x86_64-unknown-linux-gnu/bin/rustc", ["/home/wesley/.rustup/toolchains/"..., "--crate-name", "hello_world", "--edition=2021", "src/main.rs", "--error-format=json", "--json=diagnostic-rendered-ansi,"..., "--emit=dep-info,link", "-C", "opt-level=1", "-C", "embed-bitcode=no", "-C", "debuginfo=2", "-C", "debug-assertions=on", "--test", "-C", "metadata=79aa5460ab0a171e", "-C", "extra-filename=-79aa5460ab0a171e", "--out-dir", "/tmp/hello_world/target/debug/de"..., "-C", "incremental=/tmp/hello_world/tar"..., "-L", "dependency=/tmp/hello_world/targ"...], 0x56026762fa60 /* 52 vars */) = 0
...

Which shows -Copt-level=1 being passed to the compiler.

I see the same on 1.58 as well

$ strace -e fork,execve -f -- cargo +1.58 test
...
[pid 12093] execve("/home/wesley/.rustup/toolchains/1.58-x86_64-unknown-linux-gnu/bin/rustc", ["/home/wesley/.rustup/toolchains/"..., "--crate-name", "hello_world", "--edition=2021", "src/main.rs", "--error-format=json", "--json=diagnostic-rendered-ansi", "--emit=dep-info,link", "-C", "opt-level=1", "-C", "embed-bitcode=no", "-C", "debuginfo=2", "-C", "debug-assertions=on", "--test", "-C", "metadata=5eb06840b766433d", "-C", "extra-filename=-5eb06840b766433d", "--out-dir", "/tmp/hello_world/target/debug/de"..., "-C", "incremental=/tmp/hello_world/tar"..., "-L", "dependency=/tmp/hello_world/targ"...], 0x556798a2b0c0 /* 52 vars */) = 0
...

Edit: I realized that is 1.58.1 but I see the same thing on 1.58.0 and 1.57.

@TheBlueMatt
Copy link
Author

Grrrrrrr I'm so sorry, I was looking at the wrong version, the regression was between 1.56 and 1.58, let me go test now.

@wesleywiser
Copy link
Member

wesleywiser commented May 27, 2022

1.57 contained the Cargo "named profiles" feature (rust-lang/cargo#9943) which notes:

The test profile now inherits settings from the dev profile (and bench from release).

From what I'm seeing, prior to 1.57, the above Cargo.toml has no effect on cargo test.

Edit: Another interesting note from that PR:

Artifacts may no longer be shared in some circumstances. This can lead to substantially increased target directory sizes (and build times), particularly if the test profile is not the same as the dev profile. Previously, dependencies would use the dev profile for both. If the user wants to retain the old behavior, they can use an override like [profile.test.package."*"] and set the same settings as dev.

@TheBlueMatt
Copy link
Author

TheBlueMatt commented May 27, 2022

Ah, right, so that makes sense. That doesn't, however, explain why tests got so much slower on 1.59. Ran some numbers with dev-1 and dev.package.*-2:

dev-1 with cargo/rustc 1.56: 2m55 user
dev.package.*-2 with carg/rustc 1.57: 2m15 user
dev.package.*-2 with cargo/rustc 1.58.1: 2m17 user
dev.package.*-2 with cargo/rustc 1.59.0: 15m51 user

@TheBlueMatt TheBlueMatt changed the title Substantial debug-build Performance Regression in 1.59 Substantial debug-test-optimized-depedencies Performance Regression in 1.59 May 27, 2022
@wesleywiser
Copy link
Member

dev.package.-2 with cargo/rustc 1.58.1: 2m17 user
dev.package.
-2 with cargo/rustc 1.59.0: 15m51 user

This is way more than the 25% regression in core::ptr::mut_ptr::<impl *mut T>::is_null your profile found. Is there a chance what we're actually seeing is a change in build time? Could you post the results of cargo +1.58.1 test -vv and cargo +1.59.0 test -vv?

@TheBlueMatt
Copy link
Author

In all the above results that's a second run of cargo test after the first, it was also observed by two of my colleagues in lightningdevkit/rust-lightning#1497

That's why my initial response was to -Zbuild-std but that didn't change anything :(

@TheBlueMatt
Copy link
Author

Sorry, the above profile was a bit confusing - on the 1.58.1 build core/std stuff doesn't show up in perf at all - on 1.59.0 its ~90% of runtime, if you total the second column above.

@jyn514 jyn514 added I-slow Issue: Problems and improvements with respect to performance of generated code. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. labels Jun 18, 2022
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 18, 2022
@TheBlueMatt
Copy link
Author

Updated the issue description to remove all the stuff about dev-1 and the stuff that was just the cargo change, leaving only the parts talking about the dev.package.* opt-leve=2 runtime regression.

@apiraino apiraino added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 29, 2022
@workingjubilee workingjubilee added P-low Low priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Mar 3, 2023
@workingjubilee workingjubilee added P-medium Medium priority E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example and removed P-low Low priority labels Mar 3, 2023
@workingjubilee
Copy link
Contributor

Tagging this as medium priority because it is such a large regression if correct, but in a less-common combination of compiler settings, so something going awry is both expected but also it going this awry is surprising.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants