Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to stable Future trait, async/await and Tokio 0.2 runtime #985

Merged
merged 8 commits into from Nov 10, 2021

Conversation

Xanewok
Copy link
Collaborator

@Xanewok Xanewok commented Apr 9, 2021

Builds on #945. Rebased and streamlined version of #946 (thanks to @Marwes and @drahnr for doing most of the work!).

This aims to be a somewhat minimal but complete upgrade to stable Future trait and async/await.

At first I tried to split the changes into respective commits but I quickly found that that e.g. migrating storages leads to reqwest upgrade, which itself causes other places to be upgraded and so on. Rather than artificially setting up boundaries, migrating everything to futures_03 and .compat() everywhere and switching them back in following commits, I figured that this should be probably reviewed as a whole. Let me know if you really prefer to have it split more granularly somehow.

Most notably:

  • Upgrades tokio to 0.2, which supports stable Futures.
  • Migrates from futures_03::executor::ThreadPool to regular tokio::runtime::Runtime.
    Because Tokio-enabled I/O such as tokio v0.2 child processes or timeouts all implicitly now need to be spawned in a Tokio context, rather than creating a custom executor that uses ThreadPool but polls relevant Tokio-related futures in a context of a separately running runtime, we use a single Tokio runtime everywhere configured with at least the same amount of worker threads as before and with enabled blocking feature. This is designed for I/O-bound tasks (rather than CPU-bound ones) and so it fits the existing use case perfectly since we mostly used spawn_fn for operations related to disk I/O.
  • Updates reqwest to 0.10 (and by extension hyper to 0.13), supporting stable Futures.
    That's to easily support async/await in regular mode; in distributed compilation we use Tokio runtime now so it's also easier there to use Future-enabled reqwest.
  • Uses async_trait crate to migrate crucial traits like Storage, Client to async/await.
    This could be skipped in theory but that's mean sticking Pin<Box<dyn Future<Output = XYZ> + Send + 'lifetime as a return type everywhere, since async fns in traits are not supported yet due to lack of GATs (see here)
  • Migrates from futures v0.1 chains to async/await whenever possible.

I tried to retain the existing semantics as much as possible and refrain from any non-trivial refactoring to ease the review and to help this land. Let me know if everything is in order and whether you want me to change anything here.

@Xanewok
Copy link
Collaborator Author

Xanewok commented Apr 12, 2021

This triggers a Clippy lint due to to many arguments (the existing function already seem to have 8, including &self? Maybe it's due to desugaring done by async_trait) and needs one more fix to Windows server startup but everything else should be good to review.

@glandium would it be possible for you to take a stab at reviewing this this week (or get other contributors to review this to decrease the required review bandwidth)?

@jblazquez
Copy link

jblazquez commented Apr 13, 2021

This PR couldn't have arrived at a better time. We literally started using sccache a few days ago and ran into an issue on Windows where the server process would stop reading from sockets and hang the entire build process. We figured it might be a bug in the very old tokio 0.1 crate, so we tried the changes in this PR and indeed they solve the hanging issue on Windows. Thank you so much for your work on this!

However, there is one new Windows-only issue that the changes here introduce, and it's somehow related to the rustup proxy detection logic, but I don't know exactly what might be causing it. When using rustup and trying to build with these changes, compilation always fails with this message:

sccache: error: failed to execute compile
sccache: caused by: Compiler not supported: "error: couldn\'t read +stable: The system cannot find the file specified. (os error 2)\n\nerror: aborting due to previous error\n\n"

Removing this section of code to effectively disable the proxy detection logic solves the problem. It seems like (only on Windows for some reason) the proxy logic ends up invoking the resolved rustc compiler with the +stable argument, which results in a failure.

Server trace logs below:

[2021-04-13T02:42:29Z TRACE sccache::cmdline] parse
[2021-04-13T02:42:29Z DEBUG sccache::config] Attempting to read config file at "C:\\Users\\jblazquez\\AppData\\Roaming\\Mozilla\\sccache\\config\\config"
[2021-04-13T02:42:29Z DEBUG sccache::config] Couldn't open config file: The system cannot find the path specified. (os error 3)
[2021-04-13T02:42:29Z TRACE sccache::commands] Command::InternalStartServer
[2021-04-13T02:42:29Z INFO  sccache::server] start_server: port: 4226
[2021-04-13T02:42:29Z INFO  sccache::server] No scheduler address configured, disabling distributed sccache
[2021-04-13T02:42:29Z INFO  sccache::cache::cache] No configured caches successful, falling back to default
[2021-04-13T02:42:29Z TRACE sccache::cache::cache] Using DiskCache("C:\\Users\\jblazquez\\AppData\\Local\\Mozilla\\sccache\\cache", 10737418240)
[2021-04-13T02:42:29Z INFO  sccache::server] server started, listening on port 4226
[2021-04-13T02:42:39Z TRACE sccache::server] incoming connection
[2021-04-13T02:42:39Z TRACE sccache::server] handle_client
[2021-04-13T02:42:39Z DEBUG sccache::server] handle_client: compile
[2021-04-13T02:42:39Z TRACE sccache::server] compiler_info
[2021-04-13T02:42:39Z TRACE sccache::server] compiler_info cache miss
[2021-04-13T02:42:39Z TRACE sccache::compiler::compiler] detect_compiler: C:\Users\jblazquez\.cargo\bin\rustc.EXE
[2021-04-13T02:42:40Z DEBUG sccache::compiler::compiler] Found rustc
[2021-04-13T02:42:40Z TRACE sccache::compiler::rust] proxy: Found a compiler proxy managed by rustup
[2021-04-13T02:42:40Z WARN  sccache::compiler::rust] proxy: rustup found, but not where it was expected (next to rustc C:\Users\jblazquez\.cargo\bin\rustc.EXE)
[2021-04-13T02:42:40Z TRACE sccache::compiler::rust] PROXY rustup --version produced: rustup 1.23.1 (3df2264a9 2020-11-30)

[2021-04-13T02:42:40Z TRACE sccache::compiler::compiler] Found rustup proxy executable
[2021-04-13T02:42:40Z TRACE sccache::compiler::rust] proxy: rustup which rustc produced: "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.exe"
[2021-04-13T02:42:40Z TRACE sccache::compiler::compiler] Resolved path with rustup proxy "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.exe"
[2021-04-13T02:42:40Z TRACE sccache::compiler::rust] Discovering dependencies of T:\Temp\sccache-rlibreaderYh42Y4\x.rlib
[2021-04-13T02:42:40Z TRACE sccache::util] Hashed 3 files in 0.127 s
[2021-04-13T02:42:40Z TRACE sccache::server] Inserting new path proxy "C:\\Users\\jblazquez\\.cargo\\bin\\rustc.EXE" @ "T:\\Temp\\sccache" -> "C:\\Users\\jblazquez\\.cargo\\bin\\rustc.EXE"
[2021-04-13T02:42:40Z TRACE sccache::server] Inserting POSSIBLY PROXIED cache map info for "C:\\Users\\jblazquez\\.cargo\\bin\\rustc.EXE"
[2021-04-13T02:42:40Z DEBUG sccache::server] check_compiler: Supported compiler
[2021-04-13T02:42:40Z DEBUG sccache::server] parse_arguments: CannotCache(-): ["-", "--crate-name", "___", "--print=file-names", "--crate-type", "bin", "--crate-type", "rlib", "--crate-type", "dylib", "--crate-type", "cdylib", "--crate-type", "staticlib", "--crate-type", "proc-macro", "-Csplit-debuginfo=packed"]
[2021-04-13T02:42:41Z TRACE sccache::server] incoming connection
[2021-04-13T02:42:41Z TRACE sccache::server] handle_client
[2021-04-13T02:42:41Z DEBUG sccache::server] handle_client: compile
[2021-04-13T02:42:41Z TRACE sccache::server] compiler_info
[2021-04-13T02:42:41Z TRACE sccache::compiler::rust] proxy: rustup which rustc produced: "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.exe"
[2021-04-13T02:42:41Z TRACE sccache::server] compiler_info cache miss
[2021-04-13T02:42:41Z TRACE sccache::compiler::compiler] detect_compiler: C:\Users\jblazquez\.cargo\bin\rustc.EXE
[2021-04-13T02:42:41Z DEBUG sccache::compiler::compiler] Found rustc
[2021-04-13T02:42:41Z TRACE sccache::compiler::rust] proxy: Found a compiler proxy managed by rustup
[2021-04-13T02:42:41Z WARN  sccache::compiler::rust] proxy: rustup found, but not where it was expected (next to rustc C:\Users\jblazquez\.cargo\bin\rustc.EXE)
[2021-04-13T02:42:41Z TRACE sccache::compiler::rust] PROXY rustup --version produced: rustup 1.23.1 (3df2264a9 2020-11-30)

[2021-04-13T02:42:41Z TRACE sccache::compiler::compiler] Found rustup proxy executable
[2021-04-13T02:42:42Z TRACE sccache::compiler::rust] proxy: rustup which rustc produced: "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.exe"
[2021-04-13T02:42:42Z TRACE sccache::compiler::compiler] Resolved path with rustup proxy "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.exe"
[2021-04-13T02:42:42Z TRACE sccache::compiler::rust] Discovering dependencies of T:\Temp\sccache-rlibreaderx9mpD7\x.rlib
[2021-04-13T02:42:42Z TRACE sccache::util] Hashed 3 files in 0.119 s
[2021-04-13T02:42:42Z TRACE sccache::server] Inserting new path proxy "C:\\Users\\jblazquez\\.cargo\\bin\\rustc.EXE" @ "T:\\Temp\\sccache" -> "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.exe"
[2021-04-13T02:42:42Z TRACE sccache::server] Inserting POSSIBLY PROXIED cache map info for "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.exe"
[2021-04-13T02:42:42Z DEBUG sccache::server] check_compiler: Supported compiler
[2021-04-13T02:42:42Z DEBUG sccache::server] parse_arguments: CannotCache(-): ["-", "--crate-name", "___", "--print=file-names", "--crate-type", "bin", "--crate-type", "rlib", "--crate-type", "dylib", "--crate-type", "cdylib", "--crate-type", "staticlib", "--crate-type", "proc-macro", "--print=sysroot", "--print=cfg"]
[2021-04-13T02:42:43Z TRACE sccache::server] incoming connection
[2021-04-13T02:42:43Z TRACE sccache::server] handle_client
[2021-04-13T02:42:43Z DEBUG sccache::server] handle_client: compile
[2021-04-13T02:42:43Z TRACE sccache::server] compiler_info
[2021-04-13T02:42:43Z TRACE sccache::server] compiler_info cache miss
[2021-04-13T02:42:43Z TRACE sccache::compiler::compiler] detect_compiler: C:\Users\jblazquez\.rustup\toolchains\stable-x86_64-pc-windows-msvc\bin\rustc.EXE
[2021-04-13T02:42:43Z DEBUG sccache::compiler::compiler] Found rustc
[2021-04-13T02:42:43Z TRACE sccache::server] Inserting PLAIN cache map info for "C:\\Users\\jblazquez\\.rustup\\toolchains\\stable-x86_64-pc-windows-msvc\\bin\\rustc.EXE"
[2021-04-13T02:42:43Z DEBUG sccache::server] check_compiler: Unsupported compiler: error: couldn't read +stable: The system cannot find the file specified. (os error 2)

    error: aborting due to previous error

Repro environment:

  • Windows 10
  • Latest rustup (1.23.1) installed via rustup-init.exe using all the default options.

Repro steps:

  1. Download and run rustup-init.exe. Select all the default options.
  2. Clone this PR and build sccache (examples below using Git Bash):
$ git clone https://github.com/mozilla/sccache.git
$ cd sccache
$ git fetch origin pull/985/head:pr985
$ git checkout pr985
$ cargo build
  1. Run server:
$ SCCACHE_LOG=sccache=trace SCCACHE_NO_DAEMON=1 SCCACHE_START_SERVER=1 target/debug/sccache.exe
  1. Run compilation on another terminal:
$ RUSTC_WRAPPER=$(pwd)/target/debug/sccache.exe cargo build --release -j1

@drahnr
Copy link
Collaborator

drahnr commented Apr 13, 2021

@jblazquez could you file a separate issue with the information provided, it seems this is related to #666 rather than this PR

@jblazquez
Copy link

Hi @drahnr, the issue does not occur with the code in master, only this PR.

@Xanewok
Copy link
Collaborator Author

Xanewok commented Apr 13, 2021

Thanks @jblazquez for a detailed repro, I'll surely take a look!

In the meantime I fixed the distributed compilation test failure and adapted remaining Windows code, so hopefully CI will finally be happy now. Some of the changes were tiny fixups to the big migration commit, so I combined those and force-pushed, while f79c6f6 changes the logic slightly and fixes a hang in dist. compilation.

@jblazquez
Copy link

Awesome, thanks @Xanewok. With your 1ebbe5d commit I can confirm that the proxy issue is resolved.

@Xanewok
Copy link
Collaborator Author

Xanewok commented Apr 13, 2021

That's great to hear! I've been meaning to ping you once CI is green.

PS I hope this can indirectly make League at least a bit better somehow 😎

@Logarithmus
Copy link

@Xanewok sorry, I'm new to async Rust, so I don't understand why are you migrating to tokio 0.2, while tokio 1.2 is already available.

@drahnr
Copy link
Collaborator

drahnr commented Apr 18, 2021

@Xanewok sorry, I'm new to async Rust, so I don't understand why are you migrating to tokio 0.2, while tokio 1.2 is already available.

The initial migration started at a time when tokio 1.0 was not released yet. It's anticipated that the migration to 1.x will be much less work compared to this PR.

@drahnr drahnr mentioned this pull request Apr 18, 2021
@Xanewok
Copy link
Collaborator Author

Xanewok commented Apr 18, 2021

@Xanewok sorry, I'm new to async Rust, so I don't understand why are you migrating to tokio 0.2, while tokio 1.2 is already available.

The initial migration started at a time when tokio 1.0 was not released yet. It's anticipated that the migration to 1.x will be much less work compared to this PR.

Tokio 0.2 is the first version that supports stable Future trait, has been around for quite some time, and is only one version away, so it's a good migration target in general.

Moreover, Tokio 0.3 was only slightly changed in preparation for stable Tokio 1.0 release, both of which unfortunately skip support for named pipes in Windows (preferable IPC solution prior to Windows 10 1803, AF_UNIX sockets are available in later versions if needed) - that's pending on tokio-rs/tokio#3388.

With that said, I'd expect the migration to work as follows:

  1. Support for Tokio 0.2 lands via this PR
  2. In the meantime, Add initial named pipes support tokio-rs/tokio#3388 lands or we skip named pipes on Windows (unlikely)
  3. Only then we migrate to Tokio 1.0 enabled libraries

While we're at it, it might be a good idea to get some data on Windows usage. @jblazquez what Windows build/version do you expect to run sccache on?

@jblazquez
Copy link

@Xanewok, in our case, it's Windows 10 for devs and Windows Server 2019 for build machines, both build 1809 or later.

@Xanewok
Copy link
Collaborator Author

Xanewok commented Apr 27, 2021

@glandium @luser gentle ping; would it be possible to have that reviewed and maybe merged in some form? 🙇

@alnr
Copy link
Contributor

alnr commented Jul 23, 2021

Gentle bump :)

@sylvestre
Copy link
Collaborator

Unfortunately, this patch doesn't apply anymore. would it be possible to have a rebase? thanks

@Xanewok Xanewok force-pushed the migrate-to-stable-future branch 2 times, most recently from 0ed8d00 to ccab162 Compare November 9, 2021 21:14
@Xanewok
Copy link
Collaborator Author

Xanewok commented Nov 9, 2021

Rebased and CI is somewhat happy (needs #1064).

@sylvestre do you think it can be merged now? I'll work on the Tokio 1.0 upgrade right after.

Markus Westerlind and others added 6 commits November 9, 2021 23:16
Co-authored-by: Igor Matuszewski <xanewok@gmail.com>
This fixes a bug introduced during the futures rewrite - we eagerly
returned from the function rather than, as we did originally, falling
back to compiling without proxy.
This fixes a run-time error otherwise encountered in tests/oauth.rs
@Xanewok
Copy link
Collaborator Author

Xanewok commented Nov 9, 2021

Rebased over now-merged #1064, let's see if CI is fully green now.

@sylvestre
Copy link
Collaborator

Terrific! Did you notice performance improvements or regressions?

@Xanewok
Copy link
Collaborator Author

Xanewok commented Nov 10, 2021

That's a good question! We didn't establish enough metrics internally to measure the performance accordingly; the original motivation for this PR was to migrate to stable std traits and so I can't say for sure that the performance is better or worse. I would be surprised, however, if it did regress at all or by a meaningful margin!

@sylvestre sylvestre merged commit 2006639 into mozilla:master Nov 10, 2021
@luser
Copy link
Contributor

luser commented Nov 10, 2021

Terrific! Did you notice performance improvements or regressions?

FWIW, if you update the version of sccache used by Firefox to this revision in the toolchain definition, the build time metrics ought to be enough to surface any significant perf difference. sccache gets invoked a few thousand times in each Firefox build, it adds up!

Marwes pushed a commit to Marwes/sccache that referenced this pull request Nov 12, 2021
Followup on mozilla#985 to pull sccache the last little bit into tokio 1.0
Marwes pushed a commit to Marwes/sccache that referenced this pull request Nov 12, 2021
Followup on mozilla#985 to pull sccache the last little bit into tokio 1.0
sylvestre pushed a commit that referenced this pull request Nov 13, 2021
Followup on #985 to pull sccache the last little bit into tokio 1.0
drahnr pushed a commit to paritytech/cachepot that referenced this pull request Nov 13, 2021
Followup on mozilla/sccache#985 to pull sccache the last little bit into tokio 1.0
drahnr pushed a commit to paritytech/cachepot that referenced this pull request Nov 13, 2021
Followup on mozilla/sccache#985 to pull sccache the last little bit into tokio 1.0
@Xanewok Xanewok deleted the migrate-to-stable-future branch November 13, 2021 22:19
drahnr pushed a commit to paritytech/cachepot that referenced this pull request Nov 15, 2021
Followup on mozilla/sccache#985 to pull sccache the last little bit into tokio 1.0
@luser
Copy link
Contributor

luser commented Nov 15, 2021

@luser i tried, seems okayish :)

Oh yeah, I forgot that the "build times" metric is difficult to use with sccache because caching inherently throws off the result. 😐

@glandium you should add some sort of build time / # of cache misses metric that's more stable in the face of caching. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants