aes-gcm: performance is worse than OpenSSL #243
Comments
Presently you need to enable RUSTFLAGS as described here for optimum performance: https://docs.rs/aes-gcm/0.8.0/aes_gcm/#performance-notes We are working on and have partially implemented autodetection support for these CPU features which will eliminate the need to manually configure RUSTFLAGS and will be available in the next release. |
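For readers unfamiliar with what runtime CPU feature autodetection looks like, here is a minimal sketch using only the standard library's `is_x86_feature_detected!` macro. It is not the crate's actual implementation: the `Backend` enum and `select_backend` function are hypothetical names invented for illustration, and the choice to require both `aes` and `pclmulqdq` is an assumption based on the AES-NI and CLMUL instructions discussed in this thread.

```rust
/// Hypothetical backend selector (illustrative only, not the aes-gcm crate's API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    /// AES-NI plus carry-less multiply are available at runtime.
    Hardware,
    /// Portable software fallback.
    Portable,
}

/// Pick a backend at runtime instead of relying on RUSTFLAGS at compile time.
pub fn select_backend() -> Backend {
    // The detection macro only exists on x86/x86_64, so gate it with cfg.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("aes") && is_x86_feature_detected!("pclmulqdq") {
            return Backend::Hardware;
        }
    }
    // Any other architecture, or an x86 CPU without the features, falls through.
    Backend::Portable
}

/// Human-readable label, handy for logging which path a benchmark exercised.
pub fn backend_name(backend: Backend) -> &'static str {
    match backend {
        Backend::Hardware => "hardware (AES-NI + CLMUL)",
        Backend::Portable => "portable",
    }
}
```

Printing `backend_name(select_backend())` at the start of a benchmark run is a cheap way to confirm which code path is actually being measured.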
Well, it was built with RUSTFLAGS. Surprisingly the performance is approximately 50% in encryption and 30% in decryption compared to OpenSSL. |
I'm not sure that much of a difference deserves the qualifier "much". We've presently been working on features like CPU feature autodetection (which are important) and haven't heavily invested in micro-optimization. OpenSSL uses heavily optimized hand-written assembly implementations (in the case of AES-GCM, written by cryptography engineers at Intel), so reaching performance parity with those (especially in pure Rust) will be difficult. |
If anyone would like to work on improving AES-GCM performance, #74 might be a good start |
Also note: for optimum performance, pass This will significantly improve performance on Skylake, where LLVM will use the VPCLMULQDQ instruction for GHASH. |
In my experience |
I didn't see any statistically significant difference on iMac 2019, thanks anyway :)
|
Then maybe you can share your bench code. My bench code: LuoZijun/crypto-bench Bench Result: X86-64:
AArch64:
|
In #243 (comment), 16B data is used for AES-GCM tests. I bumped the data size to 8 KiB, updated all crates to the latest version, and reran some of the tests. On i5-7400 (avx2):
On Intel(R) Xeon(R) Platinum 8272CL (avx512 w/o vaes, vpclmulqdq):
|
Hello everyone! The benchmarks focus on encryption and are making use of Ideally GCM should take ~0.64 cpb on modern hardware, so similar to CTR mode. (source) All benchmarks are executed on an Intel I7 8700k with turboboost disabled and a core clock of 3.7GHz. The command used to compile and execute the code is (using This is the
All code can be found here https://github.com/Schmid7k/RustCrypto-AES-Benchmarks And here are the benchmarks. AES-GCM benchmark:
AES-CTR benchmark:
AES-CBC benchmark:
|
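Since the comment above quotes cycles per byte (cpb) while other comments in this thread report throughput, a small conversion helper may be useful. This is a sketch under the assumption that the core clock is fixed during the measurement (as with turbo boost disabled at 3.7 GHz above); the function names are hypothetical and not taken from any of the linked benchmark repositories.

```rust
use std::time::Duration;

/// Cycles per byte for a run that processed `bytes` bytes in `elapsed` wall time,
/// assuming the core ran at a constant `clock_hz` for the whole measurement.
pub fn cycles_per_byte(clock_hz: f64, bytes: u64, elapsed: Duration) -> f64 {
    clock_hz * elapsed.as_secs_f64() / bytes as f64
}

/// Inverse view: the throughput in bytes per second implied by a cpb figure.
pub fn bytes_per_second(clock_hz: f64, cpb: f64) -> f64 {
    clock_hz / cpb
}
```

Under these assumptions, the ~0.64 cpb target at 3.7 GHz corresponds to `bytes_per_second(3.7e9, 0.64)`, roughly 5.78 GB/s. If the clock is not pinned, the cpb value derived from wall time is only an approximation.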
@Schmid7k we already have |
@Schmid7k Also, curiously enough, AES-GCM should improve significantly once RustCrypto/traits#965 lands. |
@tarcieri Oh yeah I think I missed that. @newpavlov Actually in my case it improves cpb by 0.1 - 0.2. I already tried all combinations of turning options on and off and what I have right now gives me the best performance overall. Ahh I see, then I will look out for that! |
After RustCrypto/traits#965 lands I can try implementing #74 again. If the code optimizes correctly it should double the performance. Also now that inline ASM is stable, we can add an |
I found out another interesting thing. Using nightly-2022-01-01-x86_64-unknown-linux-gnu as compiler actually improves the performance of AES-GCM on my machine compared to using the latest nightly compiler. |
@newpavlov I just noticed that you mentioned a cpb measurement of 0.49 vs 0.57 in your comment. Is this for aes-gcm or some other mode? |
@Schmid7k |
I see, alright then. Btw I don't know if this is interesting to you but I found out that the performance between specifying |
IIUC It is also possible that for some reason This is why I generally prefer not to rely on |
I understand, thanks for the insights! |
hey Rustycrypto, I think OpenSSL performance is an unfair comparison; as @tarcieri noted earlier in this thread, OpenSSL has a dedicated person writing hand-crafted assembly for different instruction sets, with Perl scripts to take away the pain of updating for CPU-specific feature novelties, variations and new models. OpenSSL is now a fairly well funded project by FOSS standards. That person actually fixes more bugs in OpenSSL than he ever introduced as well. So is it a good idea to do the same with an I had more to say but GitHub swallowed my original comment draft so that's it for now. PS: I don't see OCB anywhere :P Happy hacking, |
Are there any plans on improving performance? It's not only slow when compared to |
I think we're bottlenecked on the trait design of Without that we can't take advantage of pipelining between AES-NI and (P)CLMUL(QDQ), which would give us an expected 2X speedup, as it were. I had an issue for that here, which we should probably reopen: See also: #74 As I mentioned before in this issue, we could also include inline ASM implementations for certain platforms, gated under an |
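To illustrate the structural point about pipelining, here is a toy sketch contrasting a two-pass layout (encrypt everything, then authenticate everything) with a fused single loop that does both per block. Everything in it is a stand-in: `toy_keystream` is not AES, `toy_mac_update` is not GHASH, and none of these names come from the RustCrypto crates. The sketch only shows that the two layouts compute the same result; the fused form is the shape that would let a real implementation interleave AES-NI and CLMUL work, and the expected speedup is the estimate from the comment above, not something this code demonstrates.

```rust
const BLOCK: usize = 16;

/// Stand-in for one CTR keystream block. NOT real AES; illustration only.
fn toy_keystream(counter: u64) -> [u8; BLOCK] {
    let half = counter.wrapping_mul(0x9E37_79B9_7F4A_7C15).to_le_bytes();
    let mut out = [0u8; BLOCK];
    out[..8].copy_from_slice(&half);
    out[8..].copy_from_slice(&half);
    out
}

/// Stand-in for one universal-hash update. NOT real GHASH; illustration only.
fn toy_mac_update(acc: u128, block: &[u8]) -> u128 {
    let mut buf = [0u8; BLOCK];
    buf[..block.len()].copy_from_slice(block); // zero-pad a short final block
    (acc ^ u128::from_le_bytes(buf)).wrapping_mul(0x1_0000_0000_0000_013B)
}

/// XOR one block with its keystream in place.
fn xor_block(chunk: &mut [u8], counter: u64) {
    let ks = toy_keystream(counter);
    for (b, k) in chunk.iter_mut().zip(ks.iter()) {
        *b ^= *k;
    }
}

/// Two passes: the whole buffer is encrypted first, then re-read to authenticate.
pub fn two_pass(data: &mut [u8]) -> u128 {
    for (i, chunk) in data.chunks_mut(BLOCK).enumerate() {
        xor_block(chunk, i as u64);
    }
    let mut acc = 0u128;
    for chunk in data.chunks(BLOCK) {
        acc = toy_mac_update(acc, chunk);
    }
    acc
}

/// Fused: each block is encrypted and authenticated in the same loop iteration.
pub fn fused(data: &mut [u8]) -> u128 {
    let mut acc = 0u128;
    for (i, chunk) in data.chunks_mut(BLOCK).enumerate() {
        xor_block(chunk, i as u64);
        acc = toy_mac_update(acc, chunk);
    }
    acc
}
```

Both functions leave the same ciphertext in `data` and return the same tag; only the order of memory accesses and the opportunity for instruction-level overlap differ.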
Another option would be to add architecture-specific low-level APIs to crates like If we can get things performing well that way, I think it could help inform the overall trait design for RustCrypto/traits#444. |
In my test via cargo bench, aes-gcm-256's performance is much worse:
It was built with
export RUSTFLAGS="-Ctarget-cpu=sandybridge -Ctarget-feature=+aes,+sse2,+sse4.1,+ssse3"
as documented.
For OpenSSL:
Environment:
iMac (Retina 5K, 27-inch, 2019), 3.7 GHz 6-Core Intel Core i5