sha2 automatic cpu feature detection? #319

Closed
mat-gas opened this issue Sep 15, 2021 · 4 comments

@mat-gas

mat-gas commented Sep 15, 2021

Hi,

I'm not really sure about the state of sha256 performance right now, or whether a backend is autodetected, but here's what I get:

  • sha512 seems to be faster than sha256
  • with or without the force-soft feature: same performance
  • with the asm feature: about 30% better performance for sha256, same performance for sha512

CPU used is : Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz

TL;DR: is this normal and expected?
Feature autodetection for ciphers is a blessing (for instance for AES); I don't know if that is what you're aiming for here.

Anyway, keep up the good work, and thanks again for all of this.

no feature selected

# cargo +nightly bench -p sha2 
     Running unittests (target/release/deps/sha256-b112631a7dfdc1ab)

running 4 tests
test bench1_10    ... bench:          58 ns/iter (+/- 10) = 172 MB/s
test bench2_100   ... bench:         538 ns/iter (+/- 21) = 185 MB/s
test bench3_1000  ... bench:       4,996 ns/iter (+/- 199) = 200 MB/s
test bench4_10000 ... bench:      49,216 ns/iter (+/- 8,895) = 203 MB/s

     Running unittests (target/release/deps/sha512-f8bfadf14edd04a3)

running 4 tests
test bench1_10    ... bench:          50 ns/iter (+/- 1) = 200 MB/s
test bench2_100   ... bench:         414 ns/iter (+/- 182) = 241 MB/s
test bench3_1000  ... bench:       2,269 ns/iter (+/- 32) = 440 MB/s
test bench4_10000 ... bench:      22,011 ns/iter (+/- 657) = 454 MB/s

force-soft

# cargo +nightly bench -p sha2  --features force-soft

     Running unittests (target/release/deps/sha256-806a1bb378054c86)

running 4 tests
test bench1_10    ... bench:          57 ns/iter (+/- 1) = 175 MB/s
test bench2_100   ... bench:         528 ns/iter (+/- 208) = 189 MB/s
test bench3_1000  ... bench:       4,901 ns/iter (+/- 129) = 204 MB/s
test bench4_10000 ... bench:      49,404 ns/iter (+/- 1,036) = 202 MB/s


     Running unittests (target/release/deps/sha512-474907d677bc31c1)

running 4 tests
test bench1_10    ... bench:          31 ns/iter (+/- 0) = 322 MB/s
test bench2_100   ... bench:         256 ns/iter (+/- 4) = 390 MB/s
test bench3_1000  ... bench:       2,473 ns/iter (+/- 77) = 404 MB/s
test bench4_10000 ... bench:      24,019 ns/iter (+/- 490) = 416 MB/s

asm

# cargo +nightly bench -p sha2  --features asm

     Running unittests (target/release/deps/sha256-001ce7f728ccfd14)

running 4 tests
test bench1_10    ... bench:          41 ns/iter (+/- 1) = 243 MB/s
test bench2_100   ... bench:         383 ns/iter (+/- 12) = 261 MB/s
test bench3_1000  ... bench:       3,763 ns/iter (+/- 118) = 265 MB/s
test bench4_10000 ... bench:      37,434 ns/iter (+/- 1,203) = 267 MB/s


     Running unittests (target/release/deps/sha512-f6d466348941563c)

running 4 tests
test bench1_10    ... bench:          30 ns/iter (+/- 1) = 333 MB/s
test bench2_100   ... bench:         259 ns/iter (+/- 6) = 386 MB/s
test bench3_1000  ... bench:       2,232 ns/iter (+/- 46) = 448 MB/s
test bench4_10000 ... bench:      21,327 ns/iter (+/- 355) = 468 MB/s
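
For reference, the MB/s column and the "about 30%" figure can be re-derived from the ns/iter numbers above. This is only a minimal sketch: it assumes 1 MB = 10^6 bytes and integer truncation, which matches the printed values, and the helper names `mb_per_s` and `speedup_percent` are made up for illustration.

```rust
/// Throughput in MB/s (1 MB = 10^6 bytes, truncated) for `bytes` hashed in `ns` nanoseconds.
/// Assumption: this mirrors how the bench harness derives its MB/s column.
fn mb_per_s(bytes: u64, ns: u64) -> u64 {
    bytes * 1000 / ns
}

/// Relative throughput gain, in percent, of a run taking `new_ns` over one taking `baseline_ns`.
fn speedup_percent(baseline_ns: u64, new_ns: u64) -> f64 {
    (baseline_ns as f64 / new_ns as f64 - 1.0) * 100.0
}

fn main() {
    // bench4_10000 timings for SHA-256, copied from the runs above.
    let default_ns = 49_216;
    let asm_ns = 37_434;

    println!("default: {} MB/s", mb_per_s(10_000, default_ns));
    println!("asm:     {} MB/s", mb_per_s(10_000, asm_ns));
    println!("asm gain: {:.1}%", speedup_percent(default_ns, asm_ns));
}
```

On the 10,000-byte case this gives roughly a 31% gain for sha256 with the asm feature, in line with the summary at the top.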
@newpavlov
Member

sha2 uses autodetection by default. On x86, in addition to the software backend (which can be replaced with the ASM implementation from the sha2-asm crate), we have a SHA-NI backend for SHA-256-based hashes and an AVX2 backend for SHA-512-based ones.

In other words, your CPU has the AVX2 extension, so SHA-512 gets SIMD acceleration, but SHA-256 uses the software backend because your CPU does not have the SHA-NI extension. It could be worth looking into implementing an SSE backend for SHA-256, as described here.
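
To check which of these extensions a given machine reports, a minimal sketch along the following lines may help. It uses the standard library's `is_x86_feature_detected!` macro rather than whatever detection mechanism sha2 uses internally, and the `Backend` enum and `pick_*` functions are hypothetical names for illustration, not part of the crate.

```rust
/// Hypothetical backend labels for illustration only; not types from the sha2 crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    ShaNi,
    Avx2,
    Soft,
}

/// SHA-256: use SHA-NI when the CPU reports it, otherwise fall back to software.
fn pick_sha256_backend() -> Backend {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("sha") {
            return Backend::ShaNi;
        }
    }
    Backend::Soft
}

/// SHA-512: use AVX2 when the CPU reports it, otherwise fall back to software.
fn pick_sha512_backend() -> Backend {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
    }
    Backend::Soft
}

fn main() {
    println!("SHA-256 backend: {:?}", pick_sha256_backend());
    println!("SHA-512 backend: {:?}", pick_sha512_backend());
}
```

On a CPU with AVX2 but without SHA-NI, this sketch would report the software backend for SHA-256 and AVX2 for SHA-512, matching the behaviour described above.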

@mat-gas
Author

mat-gas commented Sep 15, 2021

Thanks for the prompt and precise answer!

IMHO, and as you say, an SSE and/or AVX2 backend for SHA-256 would be great (if either is feasible), because AFAIK SHA-256 is heavily used and SHA-NI is not widely available right now.

@tarcieri
Member

As a general note: unless SHA-NI is employed, SHA-512 is somewhat counterintuitively expected to be faster than SHA-256 on 64-bit CPUs.
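
One rough way to see why: per FIPS 180-4, SHA-256 runs 64 rounds over a 64-byte block of 32-bit words, while SHA-512 runs 80 rounds over a 128-byte block of 64-bit words. The sketch below only does that arithmetic. It assumes a round costs about the same in both on a 64-bit CPU, which is a simplification, and the `Params` struct and `rounds_per_byte` helper are illustrative names.

```rust
/// Block size, round count and word size of a SHA-2 variant (values from FIPS 180-4).
struct Params {
    name: &'static str,
    block_bytes: u32,
    rounds: u32,
    word_bits: u32,
}

const SHA256: Params = Params { name: "SHA-256", block_bytes: 64, rounds: 64, word_bits: 32 };
const SHA512: Params = Params { name: "SHA-512", block_bytes: 128, rounds: 80, word_bits: 64 };

/// Compression rounds spent per input byte; lower means less work per byte.
fn rounds_per_byte(p: &Params) -> f64 {
    p.rounds as f64 / p.block_bytes as f64
}

fn main() {
    for p in [SHA256, SHA512].iter() {
        println!(
            "{}: {}-bit words, {} rounds per {}-byte block, {:.3} rounds/byte",
            p.name,
            p.word_bits,
            p.rounds,
            p.block_bytes,
            rounds_per_byte(p)
        );
    }
}
```

Under that simplified model SHA-512 needs 0.625 rounds per byte against 1.0 for SHA-256, so a software implementation on a 64-bit CPU has less work to do per input byte.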

@newpavlov
Member

Going to close this issue in favor of #327.
