Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc: Add unstable option to only emit shared/crate-specific files #83478

Merged
merged 5 commits into from Apr 2, 2021

Conversation

jyn514
Copy link
Member

@jyn514 jyn514 commented Mar 25, 2021

The intended use case is for docs.rs, which can now copy exactly the
files it cares about, rather than having to guess based on whether they
have a resource suffix or not. In particular, some files have a resource
suffix but cannot be shared between crates: rust-lang/docs.rs#1312 (comment)

The end goal is to fix rust-lang/docs.rs#1327 by reverting rust-lang/docs.rs#1324.

This obsoletes --print=unversioned-files, which I plan to remove as
soon as docs.rs stops using it.

I recommend reviewing this one commit at a time.

r? @GuillaumeGomez cc @Nemo157 @pietroalbini

This cleans up the code quite a bit, and also makes the next commit much
easier.
@jyn514 jyn514 added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-docs-rs Relevant to the docs-rs subteam, which will review and decide on the PR/issue. labels Mar 25, 2021
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 25, 2021
@rust-log-analyzer

This comment has been minimized.

@Mark-Simulacrum
Copy link
Member

I admit I'm a bit worried that some of the files labeled as UNVERSIONED here are such -- what do we lose by not versioning them with the toolchain? It seems like we're shipping them alongside Rust, so they're pretty clearly toolchain-specific and can change over time. If docs.rs or similar services want to deduplicate these files for storage purposes, and we want to make that easier, then emitting the files with a hash in their name (i.e., one dependent on the file contents) seems like the right path: then the filename uniquely identifies the file, and as such that file is unversioned in the sense that no matter what toolchain emits it will have the same contents.

Maybe I'm missing some history here or context, though; I'm a bit worried that we're adding (I guess unstable, but still) surface area to support use cases without documenting the design space a bit more -- maybe that's happened somewhere though? A link would be great!


With regards to the testing strategy for this, you asked on Zulip if you can check that a test generates a given file, but that doesn't fit the expectation I'd have? In particular, I'm thinking that what is desired is to make sure the output from --print precisely describes all files emitted, not that some particular ones are included; that would prevent the failure mode of forgetting a file in print, but also the failure mode of emitting a filename that was not actually written to disk. I'd probably use a run-make test for this, still, but it feels like the right thing to me.

@jyn514
Copy link
Member Author

jyn514 commented Mar 25, 2021

I admit I'm a bit worried that some of the files labeled as UNVERSIONED here are such -- what do we lose by not versioning them with the toolchain? It seems like we're shipping them alongside Rust, so they're pretty clearly toolchain-specific and can change over time.

I would rather not deal with this here - it's not causing bugs for docs.rs currently, and it's a very long-standing bug. This PR doesn't change those filenames at all, it just lets docs.rs list them separately from crate-specific files.

If docs.rs or similar services want to deduplicate these files for storage purposes, and we want to make that easier, then emitting the files with a hash in their name (i.e., one dependent on the file contents) seems like the right path: then the filename uniquely identifies the file, and as such that file is unversioned in the sense that no matter what toolchain emits it will have the same contents.

Honestly I would be ok with this, but it doesn't make docs.rs' life any better, it still has to deal with the complexity from previous builds. We discussed this briefly on discord but I don't think we came to any conclusions.

Maybe I'm missing some history here or context, though; I'm a bit worried that we're adding (I guess unstable, but still) surface area to support use cases without documenting the design space a bit more -- maybe that's happened somewhere though? A link would be great!

It was mostly discussed on Discord, I'll copy/paste the summary here.

I do want to say that I consider this "super unstable" and would probably close bugs about it if they didn't come directly from the docs.rs team. It's not intended to ever be stabilized.


  1. rustdoc added a new crates.js file a few weeks ago, breaking search for all newly built crates Search within crate is broken: "Uncaught TypeError: crates is undefined" docs.rs#1300
  2. dtolnay added crates.js and also some other essential files that were missing Add crates.js and woff2 fonts to required files docs.rs#1301
  3. it turns out the other essential files had a typo, breaking all builds Static resources corresponding to 20210305-1.52.0-nightly-caca2121f are missing docs.rs#1305. the typo was fixed by Fix spelling of FiraSans-*.woff2 docs.rs#1308
  4. rustdoc wanted to start adding more essential files (Remove theme.js file #82732, Update Source Code Pro and include italics #80705) and I got fed up of dealing with it so I opened Don't hard-code essential rustdoc files docs.rs#1312
  5. getopts #1289 + cleanup after the std/core split #1312 had a bug (Don't hard-code essential rustdoc files docs.rs#1312 (comment)) if we copy more files than necessary. I opened Give precedence to local shared files over global ones docs.rs#1324 giving precedence to local shared files over global ones which avoids the bug.
  6. Remove stack alignment from core::task::unsafe_spawn_inner #1324 broke docs that weren't using --extern-html-root-url (Docs for releases during 2018-{10-12} appear unstyled docs.rs#1327 (comment)).

Some possible solutions are to

  • Have docs.rs hard-code files we know can't be shared, like the search-index. That's an alternative approach to 5 which reverts Remove stack alignment from core::task::unsafe_spawn_inner #1324 and instead only copies the files that are necessary.
  • Have docs.rs serve shared files from subdirectories if the crate was built with one of the broken nightlies. That builds on Remove stack alignment from core::task::unsafe_spawn_inner #1324 and only special cases the affected nightlies.
  • Rebuild all affected crates (over 13000 releases, probably not practical).
  • Instead of using --print=unversioned-files, add a new --print=static-root-files that prints all static root filenames - basically this PR, but using --print instead of --emit and still unconditionally generating all files.
  • Add --emit=static-root-files (this PR).

In particular, I'm thinking that what is desired is to make sure the output from --print precisely describes all files emitted, not that some particular ones are included; that would prevent the failure mode of forgetting a file in print, but also the failure mode of emitting a filename that was not actually written to disk. I'd probably use a run-make test for this, still, but it feels like the right thing to me.

This PR doesn't change --print at all. The idea is that docs.rs can pass --emit=unversioned-shared-resources,toolchain-shared-resources and then copy the whole doc directory unconditionally.

@jyn514
Copy link
Member Author

jyn514 commented Mar 25, 2021

A summary of the summary: basically the issue is that we can't copy fewer shared files than rustdoc generates, and due to 2 months of broken builds about 3 years ago, we also can't copy more. So we have to know exactly which files should be shared and which shouldn't.

Copy link
Member

@GuillaumeGomez GuillaumeGomez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me. So once the debate with @Mark-Simulacrum is done, you have my approval. ;)

@Mark-Simulacrum
Copy link
Member

Ok, I have more questions, but ultimately don't feel the need to try to insert myself into all that :)

Is src/test/run-make/emit-shared-files/unversioned-files.txt used somewhere? A simple ctrl+f for the filename didn't seem to turn anything up, so I'm wondering if it's just a stray file or I missed something while looking at the test.

It also feels like the test is sort of likely to miss things in the sense that it's not exhaustively checking the files that were emitted; it appears to only verify that specific files either exist or don't. Is that intentional?

The intended use case is for docs.rs, which can now copy exactly the
files it cares about, rather than having to guess based on whether they
have a resource suffix or not. In particular, some files have a resource
suffix but cannot be shared between crates: rust-lang/docs.rs#1312 (comment)

The end goal is to fix rust-lang/docs.rs#1327
by reverting rust-lang/docs.rs#1324.

This obsoletes `--print=unversioned-files`, which I plan to remove as
soon as docs.rs stops using it.
@jyn514
Copy link
Member Author

jyn514 commented Mar 26, 2021

Is src/test/run-make/emit-shared-files/unversioned-files.txt used somewhere? A simple ctrl+f for the filename didn't seem to turn anything up, so I'm wondering if it's just a stray file or I missed something while looking at the test.

Oops, thanks - I copied this from src/test/run-make-fulldeps/print-unversioned-files and forgot to delete the files I didn't need. Should be fixed now.

It also feels like the test is sort of likely to miss things in the sense that it's not exhaustively checking the files that were emitted; it appears to only verify that specific files either exist or don't. Is that intentional?

Yes, for two reasons:

  1. I don't think it's useful to exhaustively test every file, docs.rs shouldn't be depending on the exact files anyway.
  2. This isn't meant to test those specific files, just that those categories of files are tested. In other words, if rustdoc changes crates.js in the future to only depend on the toolchain, not the crate, that would be a reasonable change and there's no need for the test to block it. Would it be helpful to add some comments with the category the file comes from?

@jyn514
Copy link
Member Author

jyn514 commented Mar 26, 2021

@Mark-Simulacrum did you want to take another look at this before it merges? You're not assigned to the PR, I just wanted to make sure it's not lost in notifications.

@Mark-Simulacrum
Copy link
Member

I would, let me assign myself.

@Mark-Simulacrum
Copy link
Member

I am OK with this proceeding; r=GuillaumeGomez, as you would like, I suppose.

Ultimately I think the tests / implementation here doesn't match what I would expect (or I don't understand it), but I'm also not really a rustdoc maintainer, and so I won't block this further I suppose.

This isn't meant to test those specific files, just that those categories of files are tested. In other words, if rustdoc changes crates.js in the future to only depend on the toolchain, not the crate, that would be a reasonable change and there's no need for the test to block it. Would it be helpful to add some comments with the category the file comes from?

I think my confusion here is that I think the added test isn't checking the "real problem" which is when a file is in the wrong category. For example, the comment in the code about docs.rs not touching default theme settings (I think) is not reflected at all in the test (e.g., as a currently failing test with a fixme comment); that worries me, as it suggests we would easily pick up more such bugs (or they may exist today). It seems like ideally there would be some way to pretty statically detect which category a file is in, e.g., if there's anything other than a cp of it, it's likely crate-specific. If it's cp'd from a include_str! or similar with no modification possible, it's toolchain-specific.

Unversioned files make no sense to me so I can't really comment there. IMO, it's a bug to have them and I'm surprised that this PR can't fix that, as it suggests this framework is insufficiently general - I would expect that with this PR, rustdoc / docs.rs would no longer have to coordinate at all on whether files are versioned or not. Is that not correct?

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 31, 2021
@jyn514
Copy link
Member Author

jyn514 commented Mar 31, 2021

For example, the comment in the code about docs.rs not touching default theme settings (I think) is not reflected at all in the test (e.g., as a currently failing test with a fixme comment); that worries me, as it suggests we would easily pick up more such bugs (or they may exist today).

The problem is that custom themes don't fit into the toolchain/crate framework: they depend on the invocation. I don't know what test you'd want: the problem is that if you run

rustdoc --theme x.css lib.rs
mv y.css x.css
rustdoc --theme x.css lib.rs

then the original x.css gets overriden. But docs.rs doesn't do that (it doesn't use --theme at all). And I don't know if that's a bug as such, it only turns into a bug if you expected the two files to be the same. I would rather not make this feature more complicated than necessary since it's only needed for docs.rs.

It seems like ideally there would be some way to pretty statically detect which category a file is in, e.g., if there's anything other than a cp of it, it's likely crate-specific. If it's cp'd from a include_str! or similar with no modification possible, it's toolchain-specific.

I'm not sure what this would look like - do you want me to add a &'static bound to write_minify and write_toolchain, and a generate_content: fn() -> OsString argument to write_crate? That would work for everything except the custom theme, which I don't mind special-casing.

I would expect that with this PR, rustdoc / docs.rs would no longer have to coordinate at all on whether files are versioned or not. Is that not correct?

Oh hmm, this is a good point. I can't remove unversioned files until docs.rs stops using --print=unversioned-files, but I can remove them at the same time I remove --print :)

@Mark-Simulacrum
Copy link
Member

I don't know whether the 'static bound makes sense, but something like that feels closer to the 'truth' in the sense that the goal I'd personally have is that you can't get it wrong.

The theme CLI arg being 'special' feels a bit off -- surely there are other similar arguments? e.g., the URL root (forget name), which affects other stuff too.

I think my expectation is that the crate-versioned files are actually (crate, toolchain, invocation)-versioned, right? That seems entirely reasonable to me, but isn't really reflected in docs today; with that change args fall into it quite reasonably I think, including theme.

@jyn514
Copy link
Member Author

jyn514 commented Mar 31, 2021

The theme CLI arg being 'special' feels a bit off -- surely there are other similar arguments? e.g., the URL root (forget name), which affects other stuff too.

Well right but the point I'm trying to make is that it doesn't matter for docs.rs. Docs.rs doesn't need the files to be stable across invocations, it just needs to know when the cache should be invalidated.

I guess it is a good point that someone could add --theme x in their RUSTDOCFLAGS in package.metadata.docs.rs, which makes this "crate-specific" in the sense that docs.rs could pass rustdoc different arguments depending on the crate. But it's not something rustdoc knows about.

@Mark-Simulacrum
Copy link
Member

I'm trying to understand your comment and I think I'm missing something.

If docs.rs only cared about cache invalidation, then you'd not need 3 categories, you could use the hash of the file to know if it needs invalidation; we are intending to use the categories (AIUI) to avoid such a scheme by assuming stability for some class of inputs (e.g., toolchain files are assumed stable across all crates). If that condition is violated, I assume that docs.rs will see bugs (e.g., broken rendering due to mismatched files).

That to me implies that it is in fact critical that we get these categories right, including in the presence of RUSTDOCFLAGS etc, as set by the user via metadata.

So theme should be a crate-specific file, because it depends on the invocation, right? Same as the docs for that crate are, realistically.

As I see it, the categories are defined as follows:

  • unversioned: these files we "promise" never change, even across toolchain upgrades (will be removed soon)
  • toolchain: these files are independent of invocation (including the crate)
  • crate-specific: these files are dependent on invocation, including crate, and obviously the toolchain

If this is inaccurate or does not match your expectations, it seems good to address that. These categories also imply stability for unchanged right hand sides as dictated per category; if that is not met, the file is in the wrong category IMO.


toolchain-only:
$(RUSTDOC) -Z unstable-options --emit=toolchain-shared-resources --output $(TOOLCHAIN_ONLY) --resource-suffix=-xxx x.rs
[ -e $(TOOLCHAIN_ONLY)/storage-xxx.js ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not toolchain specific, as it contains the resource suffix (https://github.com/Mark-Simulacrum/rust/blob/e423058751a2b098d3e469a8e6df1b7a8bbd67b6/src/librustdoc/html/render/write_shared.rs#L162-L166), right? It should go in the invocation specific category.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs.rs always uses a resource suffix that only depends on the toolchain, not anything else. Using --resource-suffix with rustflags in package.metadata.docs.rs will massively break things anyway, it's not supported.


let mut themes: Vec<&String> = themes.iter().collect();
themes.sort();

write_minify(
&cx.shared.fs,
cx.path("main.js"),
"main.js",
&static_files::MAIN_JS.replace(
"/* INSERT THEMES HERE */",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also implies this is not toolchain, but rather invocation, specific.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, I guess so. Do you mind if I leave it for later? Docs.rs was caching it before anyway: https://github.com/rust-lang/docs.rs/pull/1312/files#diff-b06b44e6583254ce69af8cedabc04a73a97599d593676e1bbad7ee442240ccf1L35 and it should only matter if someone passes a custom --theme, it seems a shame to copy it every time when it so rarely matters.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.


let mut themes: Vec<&String> = themes.iter().collect();
themes.sort();

write_minify(
&cx.shared.fs,
cx.path("main.js"),
"main.js",
&static_files::MAIN_JS.replace(
"/* INSERT THEMES HERE */",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, I guess so. Do you mind if I leave it for later? Docs.rs was caching it before anyway: https://github.com/rust-lang/docs.rs/pull/1312/files#diff-b06b44e6583254ce69af8cedabc04a73a97599d593676e1bbad7ee442240ccf1L35 and it should only matter if someone passes a custom --theme, it seems a shame to copy it every time when it so rarely matters.

src/librustdoc/html/render/write_shared.rs Outdated Show resolved Hide resolved
@Mark-Simulacrum
Copy link
Member

I don't mind leaving things for later on either of the problems, but I do think it'd be worth filing a tracking issue for this feature, and documenting the problems with it there as well as the reason we're adding it.

@jyn514
Copy link
Member Author

jyn514 commented Mar 31, 2021

I do think it'd be worth filing a tracking issue for this feature, and documenting the problems with it there as well as the reason we're adding it.

A tracking issue seems like a bad fit for something we never plan to stabilize. What do you think about putting it in the unstable book instead?

@Mark-Simulacrum
Copy link
Member

I think the important bit is that the known limitations are documented in a way that is readily available; imo today tracking issues solve that better. I think it's reasonable to track unstable features even if in there current incarnation they won't get stabilized, particularly when they are sort of intended for third party use (in this case rust project internal, i.e. docs.rs).

@jyn514
Copy link
Member Author

jyn514 commented Apr 2, 2021

I'm still a little unsure what the tracking issue is for, but the docs have been broken for about 10 days now and I would really like to land this, so I opened #83784.

@rustbot label: -S-waiting-on-author +S-waiting-on-review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 2, 2021
@Mark-Simulacrum
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Apr 2, 2021

📌 Commit 413938d has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 2, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Apr 2, 2021
…imulacrum

rustdoc: Add unstable option to only emit shared/crate-specific files

The intended use case is for docs.rs, which can now copy exactly the
files it cares about, rather than having to guess based on whether they
have a resource suffix or not. In particular, some files have a resource
suffix but cannot be shared between crates: rust-lang/docs.rs#1312 (comment)

The end goal is to fix rust-lang/docs.rs#1327 by reverting rust-lang/docs.rs#1324.

This obsoletes `--print=unversioned-files`, which I plan to remove as
soon as docs.rs stops using it.

I recommend reviewing this one commit at a time.

r? `@GuillaumeGomez` cc `@Nemo157` `@pietroalbini`
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 2, 2021
Rollup of 7 pull requests

Successful merges:

 - rust-lang#83065 (Rework `std::sys::windows::alloc`)
 - rust-lang#83478 (rustdoc: Add unstable option to only emit shared/crate-specific files)
 - rust-lang#83629 (Fix double-drop in `Vec::from_iter(vec.into_iter())` specialization when items drop during panic)
 - rust-lang#83673 (give full path of constraint in suggest_constraining_type_param)
 - rust-lang#83755 (Simplify coverage tests)
 - rust-lang#83757 (2229: Support migration via rustfix)
 - rust-lang#83771 (Fix stack overflow detection on FreeBSD 11.1+)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 31f5320 into rust-lang:master Apr 2, 2021
@rustbot rustbot added this to the 1.53.0 milestone Apr 2, 2021
@jyn514 jyn514 deleted the fine-grained-files branch April 3, 2021 23:16
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Apr 5, 2021
Give a better error when --theme is not a CSS file

Before:

```
error: invalid argument: "bacon.toml"
```

After:
```
error: invalid argument: "bacon.toml"
  |
  = help: arguments to --theme must be CSS files
```

cc rust-lang#83478
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-docs-rs Relevant to the docs-rs subteam, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docs for releases during 2018-{10-12} appear unstyled
8 participants