Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Greatly speed up doctests by compiling compatible doctests in one file #123974

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

GuillaumeGomez
Copy link
Member

@GuillaumeGomez GuillaumeGomez commented Apr 15, 2024

Fixes #75341.

I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...).

The following tests are not included into the combined doctests:

  • compile_fail
  • #![no_std]
  • have invalid AST
  • test_harness
  • no capture
  • --show-output (because the output is nicer without the extra code surrounding it)

Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately.

Because of the edition codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic.

In case the users want a specific doctest to be opted-out from this doctest merge, I added the standalone codeblock attribute:

/// ```rust,standalone
/// // This doctest will be run in its own process!
/// ```

Now the interesting part, I ran it on a few crates and here are the results (with cargo test --doc to only include doctests):

crate nb doctests before this PR with this PR
sysinfo 227 4.6s 1.11s
geos 157 3.95s 0.45s
core 4604 54.08s 23.20s
std 1147 13.88s 8.82s

r? @notriddle

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Apr 15, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@GuillaumeGomez
Copy link
Member Author

Seems like where there are enough tests, libtest simply blocks and does nothing. I'll split tests by chunks of ~200 tests at once to start and then check how to improve output.

@GuillaumeGomez GuillaumeGomez changed the title Greatly speed up doctests by compiling compatible doctests in one file Greatly speed up doctests by compiling compatible doctests in (almost) one file Apr 16, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rustbot
Copy link
Collaborator

rustbot commented Apr 17, 2024

Some changes occurred in run-make tests.

cc @jieyouxu

@rust-log-analyzer

This comment has been minimized.

@petrochenkov
Copy link
Contributor

There should probably be some opt out from this merging.

There are numerous reasons why two pieces of code may cause issues when combined together (they should compete for some common resource, in general, and that is not always detectable by a compiler).
This was previously attempted with run-pass tests from the compiler test suite, but it ended up being too brittle.

@notriddle
Copy link
Contributor

How does this affect env_logger?

@GuillaumeGomez
Copy link
Member Author

GuillaumeGomez commented Apr 17, 2024

It fails in multiple tests:

running 28 tests
test src/fmt/humantime.rs - fmt::humantime::Formatter::timestamp (line 17) ... ok
test src/fmt/mod.rs - fmt (line 21) ... ok
test src/fmt/mod.rs - fmt (line 42) ... ok
test src/fmt/mod.rs - fmt::Formatter (line 121) ... ok
test src/lib.rs - (line 175) ... ok
test src/lib.rs - (line 234) ... ok
test src/logger.rs - logger::Builder::filter (line 382) ... ok
test src/logger.rs - logger::Builder::filter_level (line 360) ... ok
test src/logger.rs - logger::Builder::format (line 231) ... ok
test src/logger.rs - logger::Builder::filter_module (line 341) ... ok
test src/logger.rs - logger::Builder::write_style (line 436) ... ok
test src/logger.rs - logger::Logger::from_default_env (line 600) ... ok
test src/logger.rs - logger::Builder::target (line 415) ... ok
test src/logger.rs - logger::Logger::from_env (line 567) ... ok
test src/logger.rs - logger::Logger::from_env (line 576) ... ok
test src/lib.rs - (line 253) ... FAILED
test src/lib.rs - (line 214) ... FAILED
test src/lib.rs - (line 19) ... FAILED
test src/logger.rs - logger::Builder (line 22) ... FAILED
test src/logger.rs - logger::Builder::from_default_env (line 176) ... FAILED
test src/logger.rs - logger::Builder::from_env (line 96) ... FAILED
test src/logger.rs - logger::Builder::from_env (line 86) ... FAILED
test src/logger.rs - logger::Builder::parse_default_env (line 199) ... FAILED
test src/logger.rs - logger::Builder::new (line 56) ... FAILED
test src/logger.rs - logger::Builder::parse_env (line 123) ... FAILED
test src/logger.rs - logger::Builder::parse_env (line 138) ... FAILED
test src/logger.rs - logger::init_from_env (line 927) ... FAILED
test src/logger.rs - logger::try_init_from_env (line 890) ... FAILED

failures:

---- src/lib.rs - (line 253) stdout ----
thread 'src/lib.rs - (line 253)' panicked at src/logger.rs:499:14:
Builder::init should not be called after logger initialized: SetLoggerError(())
stack backtrace:
   0:     0x564cf5d0b200 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9cb9ba9de157d448
   1:     0x564cf5d3115b - core::fmt::write::h712ed17a4bea48d7
   2:     0x564cf5cf3ff9 - std::io::Write::write_fmt::ha9b4cf5497604558
   3:     0x564cf5d0afbe - std::sys_common::backtrace::print::h527405b0eb64d489
   4:     0x564cf5d0c09a - std::panicking::default_hook::{{closure}}::h7c8284f1aa6c5571
   5:     0x564cf5d0bdbf - std::panicking::default_hook::haf59e725ae5fe4cb
   6:     0x564cf5ca07fa - test::test_main::{{closure}}::h24de74550d40f595
   7:     0x564cf5d0c749 - std::panicking::rust_panic_with_hook::hc2212689298edd22
   8:     0x564cf5d0c4f6 - std::panicking::begin_panic_handler::{{closure}}::h2058e51712b5fdf6
   9:     0x564cf5d0b419 - std::sys_common::backtrace::__rust_end_short_backtrace::h1dc4a638a0e69753
  10:     0x564cf5d0c227 - rust_begin_unwind
  11:     0x564cf5b2dc16 - core::panicking::panic_fmt::h88caaa095c0a5b53
  12:     0x564cf5b2e586 - core::result::unwrap_failed::ha5ec7624a00a22e4
  13:     0x564cf5b391fe - core::result::Result<T,E>::expect::ha6de1769dcf31264
                               at /home/imperio/rust/rust/library/core/src/result.rs:1034:23
  14:     0x564cf5b3610d - env_logger::logger::Builder::init::h3eaa9af3496d3a3d
                               at /home/imperio/rust/env_logger/src/logger.rs:498:9
  15:     0x564cf5b31e74 - rust_out::__doctest_8::main::hc4539ff026f50669
  16:     0x564cf5b31ee3 - rust_out::__doctest_8::TEST::{{closure}}::hffdb7dfd4fa1d8d0
  17:     0x564cf5b30176 - core::ops::function::FnOnce::call_once::h74d127fff682a19d
  18:     0x564cf5ca5d92 - test::__rust_begin_short_backtrace::h377d4000c87befa2
  19:     0x564cf5cc18b2 - test::types::RunnableTest::run::h656e7414cfb3441c
  20:     0x564cf5ca6056 - test::run_test_in_process::hcb4c940c64146602
  21:     0x564cf5cae2bd - std::sys_common::backtrace::__rust_begin_short_backtrace::h6035574334cd54dd
  22:     0x564cf5cb89c5 - core::ops::function::FnOnce::call_once{{vtable.shim}}::hebcf07f42d5ca27a
  23:     0x564cf5d1757b - std::sys::pal::unix::thread::Thread::new::thread_start::h21bb0d8f9a9dc8fb
  24:     0x7fcbf7494ac3 - start_thread
                               at ./nptl/pthread_create.c:442:8
  25:     0x7fcbf7526850 - __GI___clone3
                               at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
  26:                0x0 - <unknown>

I implemented a fallback in case compilation fails. I could do the same for failing tests. Like that, we can support both current doctests with shared context and others who don't have have this shared context issue can enjoy having doctests running much faster. What do you think?

EDIT: Another solution would be to add a new codeblock attribute like "standalone" to say that a doctest should not be combined with others, but I'm not a big fan of this approach.

@notriddle
Copy link
Contributor

I implemented a fallback in case compilation fails. I could do the same for failing tests.

That only helps with spurious failure. Spurious success is an even worse problem, if a test case accidentally relies on a global resource that another test set up. Any kind of merging has this problem, but it gets worse if the merging only happens sometimes.

This is also a problem with TEST_BATCH_SIZE, I think. If a test accidentally relies on a global resource set up by another test, then the tests might magically stop passing when 250 other tests get sandwiched between them.

@GuillaumeGomez
Copy link
Member Author

The logic currently is as follows: we merge tests, if the compilation of the combined tests failed, we fallback to the current behaviour.

One thing I thought though: we could spawn a new process for each test. That would prevent them from using "in program" common resources.

This is also a problem with TEST_BATCH_SIZE, I think. If a test accidentally relies on a global resource set up by another test, then the tests might magically stop passing when 250 other tests get sandwiched between them.

Yes it's the current issue with env_logger, hence why I'm suggesting to either try running all tests in the batch independently or start a new process for each doctest.

@notriddle
Copy link
Contributor

The troublesome possibility that I'm thinking of is:

  1. You start by setting things up so that all your tests pass in a batch, because one test relies on a process-global resource that another one set up. This is what I called a "spurious passing test."
  2. You change something so that compilation in a batch fails, and the system silently runs all your tests in separate processes instead.
  3. The error you get (something like "unwrap() called on None") is very far away from the change that caused things to break. It's "spooky action at a distance," and we should do better than that.

@GuillaumeGomez
Copy link
Member Author

GuillaumeGomez commented Apr 17, 2024

What about changing the default way to run doctests starting the 2024 edition?

EDIT: With a new standalone attribute allowing a doctest to not be part of the combined doctests.

@notriddle
Copy link
Contributor

That's better than implicit fallbacks.

@GuillaumeGomez
Copy link
Member Author

Agreed. Should I add the new standalone doctest attribute in this PR as well?

@GuillaumeGomez
Copy link
Member Author

Fixed merge conflict.

@camelid
Copy link
Member

camelid commented May 22, 2024

This seems like a good idea in general, but the example given with static mut concerns me. For the whole time they've been around, doctests have been completely isolated from each other, and now there is the potential for "spooky action at a distance" as @notriddle mentioned. Ideally we could automatically detect doctests that (indirectly) used statics and either run them separately or emit a warning that standalone should be added. Maybe it'd be too difficult to implement or risk producing false positives depending on how stdlib types are implemented, etc. Thoughts?

@camelid
Copy link
Member

camelid commented May 22, 2024

If we do consider the static-detection approach, it could potentially be implemented by creating a MIR query uses_global_mut_state or similar that would recursively check if the associated MIR body read or mutated a static mut or called a function that did the same.

@GuillaumeGomez
Copy link
Member Author

I think it'd be a nice improvement but I don't think it should be a part of this first iteration as I'm pretty sure there are cases where using statics in this context is perfectly fine and we'd need to be go through a lot of code to check in which cases it's ok or not.

@camelid
Copy link
Member

camelid commented May 25, 2024

Fair point. I guess this change is reasonable because it matches the way normal unit tests work (all compiled into one binary). Should we consider running all compatible doctests in a single binary in pre-2024 editions, then run those tests individually too, and check if the result didn't match the results of the individual tests (i.e., combined failed but all individual passed, or combined passed but at least one individual failed)? The challenge of course is that it'd cause a further slowdown on pre-2024 tests. Just a thought though to help migration to the new edition behavior.

Also curious what some other people on the team think about this change in behavior.

@GuillaumeGomez
Copy link
Member Author

Fair point. I guess this change is reasonable because it matches the way normal unit tests work (all compiled into one binary). Should we consider running all compatible doctests in a single binary in pre-2024 editions, then run those tests individually too, and check if the result didn't match the results of the individual tests (i.e., combined failed but all individual passed, or combined passed but at least one individual failed)? The challenge of course is that it'd cause a further slowdown on pre-2024 tests. Just a thought though to help migration to the new edition behavior.

I'm not sure to understand what you have in mind in this case. Is it to uncover bugs we might have missed maybe?

Also curious what some other people on the team think about this change in behavior.

That's why we'll have an FCP in any case. :)

@camelid
Copy link
Member

camelid commented May 25, 2024

I'm not sure to understand what you have in mind in this case. Is it to uncover bugs we might have missed maybe?

No, more to detect problems with user code. E.g. if doctests in someone's crate do something like depend on the same mutable global state, we can warn them that this will break in Rust 2024.

@GuillaumeGomez
Copy link
Member Author

I see. I'm not sure it's a good idea though. Maybe adding an option to force this new behaviour on non-2024 edition runs could be a way for users to give it a try?

@camelid
Copy link
Member

camelid commented May 25, 2024

Yeah, we should probably focus on that for now. Could also consider a crater run.

@GuillaumeGomez
Copy link
Member Author

I'm still hoping we can make it before the 2024 edition is done. Let's hope I'm not too optimistic here. ^^'

Copy link
Member

@camelid camelid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments about the docs. In terms of the code itself, I believe we should refactor the doctest code before making this change. It's hard to follow what's happening since test collection, processing, and running are all mixed together—not because of this PR, rather due to the existing code. There are also other issues like functions with tons of arguments, which is improved somewhat here but should ideally be done in a separate PR that doesn't change behavior.

I'm currently working on such a refactor and will open a PR soon. Just to be clear, my goal is not to overhaul the test running system but just try to organize and decouple the code better.

@@ -376,6 +376,58 @@ that the code sample should be compiled using the respective edition of Rust.
# fn foo() {}
```

Starting the 2024 edition[^edition-note], compatible doctests will be merged as one before being
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Starting the 2024 edition[^edition-note], compatible doctests will be merged as one before being
Starting in the 2024 edition[^edition-note], compatible doctests will be merged as one before being

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps also add a sentence explaining that it's for performance reasons? Just so people don't wonder why we're making the change. Could also be worth clarifying that if it fails, the individual tests will be rerun to give an exact error.

@@ -376,6 +376,58 @@ that the code sample should be compiled using the respective edition of Rust.
# fn foo() {}
```

Starting the 2024 edition[^edition-note], compatible doctests will be merged as one before being
run. It means that they will share the process, so any change global/static variables will now
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
run. It means that they will share the process, so any change global/static variables will now
run. It means that they will share the process, so any change to global/static variables will now

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 31, 2024
Comment on lines +18 to +19
.arg("--edition")
.arg(edition)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: add an edition() helper to Rustdoc just like the one on Rustc.

Comment on lines +21 to +22
.arg("--extern")
.arg(format!("foo={}", dep.display()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: add an extern_() helper like the one on Rustc.

@bors
Copy link
Contributor

bors commented Jun 1, 2024

☔ The latest upstream changes (presumably #124577) made this pull request unmergeable. Please resolve the merge conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-edition-2024 Area: The 2024 edition S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rustdoc should run all doctests in one binary
9 participants