Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split up crates so end users can get faster compile times #616

Open
kevinheavey opened this issue Jan 21, 2024 · 9 comments
Open

Split up crates so end users can get faster compile times #616

kevinheavey opened this issue Jan 21, 2024 · 9 comments

Comments

@kevinheavey
Copy link

kevinheavey commented Jan 21, 2024

I have noticed I have some crates that need only the decompress::step_1 function in curve25519-dalek, but they have to pull in the whole dependency, which takes a long time to compile. Some rough experiments on my machine found that stripping away the stuff I don't need reduces compile times for the crate by ~90%.

I believe I have a similar situation with the ed25519-dalek crate, so before getting stuck on the details of what code could be extracted into a new crate (and re-exported for backwards compatibility), I'd like to check if this is something the maintainers are open to.

(Something similar could be achieved with features but it would be harder and messier in my opinion)

@tarcieri
Copy link
Contributor

I have noticed I have some crates that need only the decompress::step_1 function in curve25519-dalek

Your use case sounds rather atypical

but they have to pull in the whole dependency, which takes a long time to compile

Timing a build of both curve25519-dalek and ed25519-dalek from a fresh clone with no target/ directory to reuse on my machine:

$ time cargo build --release
    Updating crates.io index
   Compiling version_check v0.9.4
   Compiling typenum v1.17.0
   Compiling semver v1.0.21
   Compiling libc v0.2.152
   Compiling platforms v3.3.0
   Compiling cfg-if v1.0.0
   Compiling subtle v2.5.0
   Compiling signature v2.2.0
   Compiling zeroize v1.7.0
   Compiling ed25519 v2.2.3
   Compiling generic-array v0.14.7
   Compiling rustc_version v0.4.0
   Compiling curve25519-dalek v4.1.1 (/private/tmp/curve25519-dalek/curve25519-dalek)
   Compiling cpufeatures v0.2.12
   Compiling block-buffer v0.10.4
   Compiling crypto-common v0.1.6
   Compiling digest v0.10.7
   Compiling sha2 v0.10.8
   Compiling ed25519-dalek v2.1.0 (/private/tmp/curve25519-dalek/ed25519-dalek)
    Finished release [optimized] target(s) in 2.42s
cargo build --release  5.52s user 0.69s system 249% cpu 2.485 total

2.4 seconds on my computer. I wouldn't consider that "a long time".

@kevinheavey
Copy link
Author

kevinheavey commented Jan 21, 2024

On my computer it takes 8.8 seconds. And I find GitHub Actions tends to be about 4x slower than my machine.

   Compiling proc-macro2 v1.0.78
   Compiling version_check v0.9.4
   Compiling unicode-ident v1.0.12
   Compiling typenum v1.17.0
   Compiling semver v1.0.21
   Compiling platforms v3.3.0
   Compiling cfg-if v1.0.0
   Compiling cpufeatures v0.2.12
   Compiling subtle v2.5.0
   Compiling signature v2.2.0
   Compiling rand_core v0.6.4
   Compiling ed25519 v2.2.3
   Compiling generic-array v0.14.7
   Compiling rustc_version v0.4.0
   Compiling curve25519-dalek v4.1.1 (/home/user/tmp/curve25519-dalek/curve25519-dalek)
   Compiling quote v1.0.35
   Compiling syn v2.0.48
   Compiling crypto-common v0.1.6
   Compiling block-buffer v0.10.4
   Compiling digest v0.10.7
   Compiling sha2 v0.10.8
   Compiling zeroize_derive v1.4.2
   Compiling curve25519-dalek-derive v0.1.1 (/home/user/tmp/curve25519-dalek/curve25519-dalek-derive)
   Compiling zeroize v1.7.0
   Compiling x25519-dalek v2.0.0 (/home/user/tmp/curve25519-dalek/x25519-dalek)
   Compiling ed25519-dalek v2.1.0 (/home/user/tmp/curve25519-dalek/ed25519-dalek)
    Finished release [optimized] target(s) in 8.83s

Your use case sounds rather atypical

The use case is the Solana Pubkey type, which needs to check if bytes are a curve point but otherwise doesn't need curve25519-dalek

@tarcieri
Copy link
Contributor

If that's all you need, you shouldn't be using ed25519-dalek at all. You can decompress an Edwards curve point with curve25519-dalek alone.

I'm not sure how your proposed solution would help either. You seem more in need of feature gating than more crates.

More crates help improve parallelism and can reduce the algorithmic complexity of compiling any one crate, and we do use them in spades as part of the @RustCrypto project, to the point people complain about the sheer number of dependencies. But at the end of the day, you're still compiling the same amount of code.

To really micro-optimize for your use case, you'd probably need feature gating to be able to pull in just the field implementation alone. But that's also something we deliberately avoid exposing the FieldElement type.

@kevinheavey
Copy link
Author

curve25519-dalek takes 7.5 seconds to compile on my machine:

   Compiling semver v1.0.21
   Compiling proc-macro2 v1.0.78
   Compiling unicode-ident v1.0.12
   Compiling platforms v3.3.0
   Compiling cfg-if v1.0.0
   Compiling cpufeatures v0.2.12
   Compiling subtle v2.5.0
   Compiling zeroize v1.7.0
   Compiling rustc_version v0.4.0
   Compiling curve25519-dalek v4.1.1 (/home/user/tmp/curve25519-dalek/curve25519-dalek)
   Compiling quote v1.0.35
   Compiling syn v2.0.48
   Compiling curve25519-dalek-derive v0.1.1 (/home/user/tmp/curve25519-dalek/curve25519-dalek-derive)
    Finished release [optimized] target(s) in 7.54s

I'm not sure how your proposed solution would help either. You seem more in need of feature gating than more crates.

Features can also achieve the end result but I personally find that crates are cleaner and have fewer footguns (e.g. it's easy to forget to check a certain crate feature)

But at the end of the day, you're still compiling the same amount of code.

This is about being able to compile less code, by having a more lightweight dependency than curve25519-dalek

@tarcieri
Copy link
Contributor

Again, the only way we could micro-optimize for your use case is by creating a crate that contains something we're deliberately trying to avoid exposing

@kevinheavey
Copy link
Author

One way I've seen projects address this shortcoming in Rust visibility is to publish an "internals" crate and put a notice in the README that you shouldn't use the crate directly e.g. polars-core.

Additionally, an internals crate could have a default feature that prints a warning if someone tries to use the crate without disabling the feature.

@tarcieri
Copy link
Contributor

tarcieri commented Feb 1, 2024

@kevinheavey you haven't mentioned specifically what you're doing beyond needing decompress::step_1, but it seems like e.g. solving the curve equation for a FieldElement isn't really going to be useful without the higher level types in curve25519-dalek.

What are you doing exactly?

@rozbb
Copy link
Contributor

rozbb commented Feb 2, 2024

If the only goal is to validate Ed points and that's it, then I think you need a super fine grained slice of this crate. I think the most appropriate thing might be to just copy the u32 backend, copy the decompress::step_1 function, and that's it.

If we had feature gates that fine, we'd have a lot of feature gates

@kevinheavey
Copy link
Author

kevinheavey commented Feb 2, 2024

@kevinheavey you haven't mentioned specifically what you're doing beyond needing decompress::step_1, but it seems like e.g. solving the curve equation for a FieldElement isn't really going to be useful without the higher level types in curve25519-dalek.

What are you doing exactly?

@tarcieri the Solana Pubkey struct has a find_program_address method that iterates until it finds an address that is not on the ed25519 curve and hence cannot have an associated private key. The only part of curve25519-dalek it uses for this is:

        curve25519_dalek::edwards::CompressedEdwardsY::from_slice(_bytes.as_ref())
            .decompress()
            .is_some()

and if you pare that down a bit further it turns out it only needs the step_1 function. Or even more specifically, just the first element of the tuple returned by that function

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants