soft-sha512 code size seems unreasonably high on thumbv7em #561

Open
TomCrypto opened this issue Feb 9, 2024 · 1 comment

Comments

@TomCrypto

I'm working on an embedded project that needs ed25519 signing, which pulls in sha2 for the SHA-512 step of the signing procedure. On a thumbv7em target, the SHA-512 implementation consumes a significant chunk of code space:

$ cargo bloat --release --filter sha2
File .text    Size Crate Name
0.9% 17.0% 26.8KiB  sha2 sha2::sha512::compress512
0.2%  4.4%  6.9KiB  sha2 sha2::sha256::compress256
0.0%  0.0%      0B       And 0 smaller methods. Use -n N to show more.
1.1% 21.4% 33.7KiB       filtered data size, the file size is 3.0MiB

If I use opt-level = "z" it is better, but still pretty high:

$ cargo bloat --release --filter sha2
File .text    Size Crate Name
0.5% 10.1% 10.8KiB  sha2 sha2::sha512::compress512
0.2%  3.0%  3.2KiB  sha2 sha2::sha256::compress256
0.0%  0.2%    250B  sha2 sha2::sha512::soft::sha512_schedule_x2
0.0%  0.2%    210B  sha2 sha2::sha512::soft::sha512_digest_round
0.0%  0.1%    162B  sha2 sha2::sha256::soft::schedule
0.0%  0.1%    162B  sha2 sha2::sha256::soft::sha256_digest_round_x2
0.0%  0.0%     32B  sha2 core::iter::adapters::zip::TrustedRandomAccessNoCoer...
0.0%  0.0%      0B       And 0 smaller methods. Use -n N to show more.
0.7% 13.8% 14.8KiB       filtered data size, the file size is 2.0MiB

The amounts above seem quite onerous for my target, which has only 256 kB of flash, and would potentially be a non-starter for targets with even less code storage.

I suspect it's a combination of inlined software 64-bit integer arithmetic and a large amount of generated code from macro expansion.
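To illustrate the 64-bit arithmetic point: thumbv7em has 32-bit registers, so every 64-bit rotate in the SHA-512 round has to be assembled from 32-bit operations. Below is a rough sketch of what that lowering amounts to. The helper name `rotr64_via_u32` is made up for illustration, and the instructions the compiler actually emits may well differ.

```rust
/// Rotate the 64-bit value `hi:lo` right by `n` bits using only 32-bit
/// operations. Hypothetical helper for illustration; valid for 0 < n < 32.
fn rotr64_via_u32(hi: u32, lo: u32, n: u32) -> (u32, u32) {
    debug_assert!(n > 0 && n < 32);
    // Bits shifted out of the high half fall into the top of the low half.
    let new_lo = (lo >> n) | (hi << (32 - n));
    // Bits shifted out of the low half wrap around into the top of the high half.
    let new_hi = (hi >> n) | (lo << (32 - n));
    (new_hi, new_lo)
}
```

That is four shifts and two ORs for one rotate. A SHA-512 round uses six such rotates for its two big sigma functions alone, so a fully unrolled and inlined round sequence would multiply that cost by the number of rounds.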

If improving the code size would be a performance regression on some platforms, perhaps an implementation favoring code size, gated behind a crate feature flag, could be of interest? Something like a "32-bit-friendly" version.

@newpavlov
Member

newpavlov commented Feb 9, 2024

#547 may reduce the size a bit, but the main reason for the big code size is likely the aggressive inlining of round processing code that we use. Our block compression function compiles down to completely branchless code, which is usually what we want, but it's not desirable for constrained targets.

It may be worth introducing a no_unroll flag similar to the one in the keccak crate. We would gladly accept such a PR.
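For anyone picking this up, here is a minimal sketch of what such a switch could look like. It is not the sha2 crate's code: the function and macro names are hypothetical, only 16 of the 80 rounds are shown, the round constants are assumed to be pre-added into the caller-supplied `kw` words, and the `no_unroll` feature would have to be declared in the crate's `Cargo.toml`.

```rust
/// One SHA-512 round on the working variables [a, b, c, d, e, f, g, h].
/// `kw` is assumed to be (round constant + schedule word), supplied by the caller.
#[inline(always)]
fn round(s: &mut [u64; 8], kw: u64) {
    let [a, b, c, d, e, f, g, h] = *s;
    let big_s1 = e.rotate_right(14) ^ e.rotate_right(18) ^ e.rotate_right(41);
    let ch = (e & f) ^ (!e & g);
    let t1 = h
        .wrapping_add(big_s1)
        .wrapping_add(ch)
        .wrapping_add(kw);
    let big_s0 = a.rotate_right(28) ^ a.rotate_right(34) ^ a.rotate_right(39);
    let maj = (a & b) ^ (a & c) ^ (b & c);
    let t2 = big_s0.wrapping_add(maj);
    *s = [t1.wrapping_add(t2), a, b, c, d.wrapping_add(t1), e, f, g];
}

/// Expands to one inlined `round` call per listed index (straight-line code).
macro_rules! unrolled_rounds {
    ($s:expr, $kw:expr; $($i:literal)*) => {
        $( round($s, $kw[$i]); )*
    };
}

/// Speed-oriented variant: 16 copies of the round body, no branches.
pub fn rounds_unrolled(s: &mut [u64; 8], kw: &[u64; 16]) {
    unrolled_rounds!(s, kw; 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15);
}

/// Size-oriented variant: an ordinary loop, so the round body only needs to
/// be emitted once (unless the optimizer decides to unroll it again).
#[inline(never)]
pub fn rounds_rolled(s: &mut [u64; 8], kw: &[u64; 16]) {
    for &x in kw.iter() {
        round(s, x);
    }
}

/// Picks a variant at compile time based on the hypothetical `no_unroll` feature.
pub fn rounds(s: &mut [u64; 8], kw: &[u64; 16]) {
    if cfg!(feature = "no_unroll") {
        rounds_rolled(s, kw)
    } else {
        rounds_unrolled(s, kw)
    }
}
```

Both variants compute the same result, so the feature would only trade speed for size. Whether the rolled loop actually stays rolled depends on the optimization level, which is worth checking with cargo bloat as in the original report.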
