Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Extract serde_core out of serde crate #2608

Open
wants to merge 43 commits into
base: master
Choose a base branch
from

Conversation

osiewicz
Copy link

@osiewicz osiewicz commented Sep 4, 2023

This PR implements approach suggested in #2584 by @epage and @soqb. Also #2181
There were two changes initially that were since reverted during the lifetime of this PR:

  • I've changed serde_derive to use serde_core instead of serde. This has since been reverted as it would be a breaking change.
  • Code examples in serde_core aliased serde_core as serde so that the docs reexported in serde refer to serde and not serde_core. This however was not necessary following commit 2258bb4

Before (11.4s for a clean release build with derive feature, 8 cores):
obraz

After (7.8s for a clean release build with derive feature, 8 cores):
serde build graph after change (down to 8s)

@osiewicz
Copy link
Author

osiewicz commented Sep 4, 2023

Ugh, seems like there's a bunch of CI failures on set up job stage.

@epage
Copy link

epage commented Sep 4, 2023

Besides the gains here, this allows serde_json and friends to depend on serde_core, preventing packages using serde w/ derive from serializing things.

@NobodyXu
Copy link

There're merge conflicts on this branch again which needs to be resolved.

@osiewicz
Copy link
Author

osiewicz commented Oct 30, 2023

Hey, the merge conflicts on this PR are usually trivial to resolve (the PR itself doesn't change anything about the code in serde, just it's structure) and they'll pop up with each new serde version released; I am not sure if it makes sense to massage it all the time given that this PR has went almost unnoticed for 2 months now. I'd say it should be fine to resolve them once this PR gets some attention. =)

@NobodyXu
Copy link

Hmmm I thought this PR is blocked because merge conflicts and would get some attentions once it is resolved, but it turns out that it just doesn't get much attention regardless of merge conflicts.

This PR would really help improve compilation time of serde with derive enabled and I was looking forward to it, it's sad that it has been blocking for so long was no review from maintainer.

@osiewicz
Copy link
Author

osiewicz commented Oct 30, 2023

Yeah, I think so too. On the other hand, it was not asked for so no hard feelings there.
I would also love to see it merged.

But yeah, the conflicts will always be there whenever a new version of serde is released, as this PR moves half of the files into other directory (which git seems to be able to resolve conflicts for on it's own but GH can't see that) + the version number changes. Either way, if there's a consensus to merge it, we can just rebase the changes onto the current master and that should do it.

@oli-obk
Copy link
Member

oli-obk commented Oct 30, 2023

Two things:

  • Rebases are preferred over merges. A rebase that only conflicts due to file moves is usually resolved trivially without user input
  • nightly is broken, you may need to run with nightly, too and adjust some imports
  • I'm not going to push for this in the next two months as I'm going on leave soon
  • I can't count 😆

@osiewicz
Copy link
Author

Hey, should we come back to this PR whenever you're available then? By that I also mean whether I should fix it up in the meantime or is it fine to just come back to it in 2 months and rebase it then?

@oli-obk
Copy link
Member

oli-obk commented Oct 30, 2023

I think it would be fine to just not rebase it. Either @dtolnay will pick it up at whatever state it is in while I'm gone and ask you to rebase, or I will when I'm back.

@epage
Copy link

epage commented Nov 9, 2023

As a real-world example of how this can be beneficial, I was looking at the timings chart for ripgrep on the parallel frontend blog post. I did some analysis on it. I'm guessing this change on its own would take ripgrep build times on a 24-core machine from 9s to 6-7s.

@oli-obk oli-obk self-assigned this Feb 9, 2024
Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

Some things I would want to see further consideration of before deciding how to move forward with this design are:

  • the effect on compiler diagnostics seen by users, or other potential friction from this change, particularly when it comes to users who may be new to Rust

  • implications for serde 2.0 when it comes to the private helper types that serde_derive relies on

  • implications for serde 2.0 when it comes to wishlist for 2.0

(Maybe @oli-obk can work with you to flesh these out.)

Compiler diagnostics

After this change, what are the situations in which diagnostics that used to look like "the trait bound `T: serde::ser::Serialize` is not satisfied" will instead say "the trait bound `T: serde_core::ser::Serialize` is not satisfied"? — these occur often when you are using a data format library like serde_json together with a data structure library whose Serde trait impls are behind a feature.

I am not familiar with the heuristics rustc uses to choose whether to display the trait path in these errors as serde::ser::Serialize or serde_core::ser::Serialize or serde_core::Serialize or just Serialize.

For errors that refer to serde, users have been able to unblock themselves well enough by finding that the crate's documentation mentions "serde support" behind a feature, or noticing in cargo add it printed that there is a serde feature, or noticing that the relevant impl serde::Serialize for ... in the crate's source code has a conditional compilation attribute.

A crate is unlikely to document that it has "serde_core support". To some users it will be obvious that a serde trait impl is interchangeable with a serde_core trait impl, but this isn't the case in general. For example thiserror and thiserror_core are unrelated in this way, serde_json and serde_json_core are unrelated, etc. Maybe consider whether serde_traits, or something else, would be better naming.

I think this is probably inevitable given that moving the traits to a different crate that is a dependency of serde is most likely required for ever publishing serde_derive 2.0. (Although semver trick is another option here, which log used successfully in the past.) So this one is a nonblocking concern. Nevertheless it would be good to understand if there is a need for any kind of followup in rustc. Serde is commonly one of the very first crates that someone new to Rust would first use, so there is a lot of leverage to scare users off of Rust with hours of frustration, likely greater than the cost of 3 seconds in a first-time build.

What mitigation options exist, and how do they compare? Is setting [lib] name = "serde" on serde_core better or worse?

Private helper types

Currently this PR has moved all of serde::__private into serde_core.

That may not fly. It means that someone is able to write:

serde = { package = "serde_core", version = "1" }
serde_derive = "1"

which then means all of serde_derive 1.0's helper types need to be present in serde_core in perpetuity, even all the ones serde_derive 2.0 does not use, even if serde 1.0 is not in the build graph. I have not measured recently, but these helper types account for a nontrivial part of the compile time of serde.

What alternative options exist, and how do they compare? Keep helper types in serde for now? Put helper types in a third crate somehow? Some feature voodoo? Progress on rust-lang/rust#54363 would be valuable here too.

Items from 2.0 wishlist

Relocating traits into a different crate is a one-time deal. When we do serde_derive 2.0 (in which I'd like to remove support for many attributes that are too niche, and some changes like default = $expr should take an expression instead of a string literal) and serde 2.0 to go with it, it needs to be written against serde_core 1.0 because the cost–benefit from improving the macro is a lot better than the cost–benefit of breaking the ecosystem with incompatible traits.

That means some of what might be on the table for serde 2.0 needs to be accounted for now. In particular, this would be the time to figure out what(whether) to do about #1070. Do the Serialize and Deserialize impl for Result need to be behind a feature that is enabled only by serde 1.0 and not 2.0? Missing that now would be a blunder because it cannot be fixed later.

Hopefully that one is the only 2.0 item needing consideration, but I am not sure.

serde/Cargo.toml Outdated Show resolved Hide resolved
serde_core/Cargo.toml Outdated Show resolved Hide resolved
serde_core/Cargo.toml Outdated Show resolved Hide resolved
serde_core/Cargo.toml Outdated Show resolved Hide resolved
serde_core/Cargo.toml Outdated Show resolved Hide resolved
serde_core/Cargo.toml Outdated Show resolved Hide resolved
serde_core/Cargo.toml Outdated Show resolved Hide resolved
serde_core/LICENSE-APACHE Show resolved Hide resolved
serde_core/src/lib.rs Outdated Show resolved Hide resolved
The actual  version is 1.0.197 and not 1.0.190, but I've kept it consistent before rebasing.
@epage
Copy link

epage commented Feb 26, 2024

What alternative options exist, and how do they compare? Keep helper types in serde for now? Put helper types in a third crate somehow? Some feature voodoo? Progress on rust-lang/rust#54363 would be valuable here too.

My assumption for this is the helpers would be in serde and we keep existing derive users working the same way we do today.

  • No breaking change for now
  • Isolate the semver compatibility guarentees

Maybe with serde@2, we could switch to serde_derive + serde_derive_macro where serde_derive has the helpers but doing that is dependent on a lot of different factors that don't seem worth discussing atm.

serde_derive/Cargo.toml Outdated Show resolved Hide resolved
@osiewicz
Copy link
Author

osiewicz commented Feb 29, 2024

Just to add to second example provided by @epage; it looks like rustc may sometimes pick up a different path to (De)serialize trait, based on whether user has serde in non-macro scope.
Baseline code:

use serde;

fn main() {
    let _ = u32::deserialize("");
}
Stderr with `use serde`
error[E0599]: no function or associated item named `deserialize` found for type `u32` in the current scope
 --> src/main.rs:7:23
  |
7 |     let _: u32 = u32::deserialize(());
  |                       ^^^^^^^^^^^ function or associated item not found in `u32`
  |
  = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
  |
2 + use serde::Deserialize;
  |

For more information about this error, try `rustc --explain E0599`.
warning: `throwaway_serde` (bin "throwaway_serde") generated 1 warning
error: could not compile `throwaway_serde` (bin "throwaway_serde") due to previous error; 1 warning emitted
Stderr with `use serde_core`
error[E0599]: no function or associated item named `deserialize` found for type `u32` in the current scope
 --> src/main.rs:7:23
  |
7 |     let _: u32 = u32::deserialize(());
  |                       ^^^^^^^^^^^ function or associated item not found in `u32`
  |
  = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
  |
2 + use serde_core::Deserialize;
  |

For more information about this error, try `rustc --explain E0599`.
warning: `throwaway_serde` (bin "throwaway_serde") generated 1 warning
error: could not compile `throwaway_serde` (bin "throwaway_serde") due to previous error; 1 warning emitted

When both serde and serde_core are used, the one that's imported first is the one that's suggested in diagnostics.

As for the first example, could it be that rustc always picks "local" path to a (De)serialize trait, since the failure here is related to monomorphization/typeck?

Comment on lines 228 to 231
// Used by generated code and doc tests. Not public API.
#[doc(hidden)]
#[path = "private/mod.rs"]
pub mod __private;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also run compilation time tests while keeping this in serde, so we have some concrete numbers to talk about

Copy link
Author

@osiewicz osiewicz Feb 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like to keep private in serde, we'd have to figure out what to do about __deserialize_content in Deserializer trait. private::de is a substantial contributor to private's line count, so I guess having some of private in serde and then rest in serde_core wouldn't quite cut it.
Instead, I've added a new feature to serde_core (private) that enables compilation of serde_core::private::{de, ser} module (kinda like no_serde_derive rustc cfg). serde_core::private is then enabled with serde::derive.

I've also updated other benchmarks, as one in PR description was taken on a relatively old machine I don't have access to right now. All timings are for clean release build with 10 cores
Master (derive=true, 3.8s)
image
Master (derive=false, 1.8s)
image

This branch, commit 49d028d, derive=true, 2.4s:
image
This branch, patch
private_feature.patch, derive=false, 1.5s:
image

So all in all, moving private from serde_core to serde would net us about 200ms out of 1400ms on M1 Max.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, let's do that then!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern about having private in serde_core is that it requires continued strict coupling between serde_core and serde_derive. For example, I suspect this would make having a serde_derive v2 more difficult because you couldn't run it in parallel to serde_derive v1 without much finagling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood it as "moving private to serde" is a net gain, so that's what I meant with "let's do that":

Let's move it to serde and out of serde_core entirely

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my intent was never to push for having private in serde_core, it's just that having it in serde is apparently not as trivial as moving source files around, hence I went the other way with private feature to gauge the impact of not having it in serde_core. Given that we're fine with the benefits that it brings, I will poke at moving private to serde.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey,
Just to give an update, I've encountered an issue with non-public API of Deserialize (__deserialize_content); should it still be a part of serde_core? If so, Content type would have to be in serde_core as well, which isn't ideal due to reasons outlined by dtolnay. I've tried numerous angles to remove __deserialize_content from Deserialize (and somehow move it's implementation into serde), without luck.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we'll have to revert #2148 for now and reopen the issue it closed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alrighty, private has been moved into serde.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants