Support SIMD on Rust stable #520

koute · 2023-03-21T12:36:40Z

This PR makes the SIMD backend work on stable Rust.

The nightly-only packed_simd dependency was removed.
A small mostly API-compatible replacement for packed_simd was added.
The AVX2 backend now works on stable Rust.
The AVX512 backend still requires nightly as those intrinsics are not yet stabilized, so on stable Rust the AVX2 backend will be used even if the target CPU supports AVX512.
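
To make the rule in the last point concrete, here is a minimal sketch in Rust. The `CompiledBackend` enum and the `compiled_backend` function are hypothetical names invented for illustration; they are not part of this PR, and the real selection happens at build time rather than in a function like this. The `Serial` variant is an assumption standing in for whatever non-SIMD fallback is used.

```rust
/// Hypothetical list of backends, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompiledBackend {
    Serial,
    Avx2,
    Avx512,
}

/// Encodes the rule described above: the AVX512 backend is only eligible
/// on a nightly toolchain, so on stable an AVX512-capable target falls
/// back to AVX2 (assuming AVX2 is also enabled for that target).
pub fn compiled_backend(has_avx2: bool, has_avx512: bool, nightly: bool) -> CompiledBackend {
    if has_avx512 && nightly {
        CompiledBackend::Avx512
    } else if has_avx2 {
        CompiledBackend::Avx2
    } else {
        CompiledBackend::Serial
    }
}
```

With this sketch, `compiled_backend(true, true, false)` models a stable toolchain on an AVX512-capable target and yields `CompiledBackend::Avx2`.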

tarcieri · 2023-03-21T13:25:26Z

Awesome!

koute · 2023-03-24T15:18:37Z

After this PR gets in, I also have another PR in the queue (not yet finished, but it already works) where I want to switch SIMD to be entirely autodetected at runtime, so that it's no longer necessary to compile with target_feature. This should come at little to no performance loss; in fact, from what I can see I sped it up a little: e.g. the vartime fixed base multiplication bench dropped from ~4.2ms to ~4.08ms on the AVX2 backend on my machine.
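
As a rough idea of what runtime autodetection can look like, the sketch below uses the standard library's `is_x86_feature_detected!` macro. The `Backend` enum and `detect_backend` function are hypothetical names used only for this example; the follow-up PR may be structured quite differently.

```rust
/// Hypothetical backend choice, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Serial,
    Avx2,
}

/// Picks a backend at runtime instead of relying on compile-time
/// `target_feature` flags.
pub fn detect_backend() -> Backend {
    // The detection macro only exists on x86 targets, so gate it.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
    }
    // Any other architecture, or an x86 CPU without AVX2.
    Backend::Serial
}
```

The standard library caches the detection result, so calling `detect_backend()` repeatedly should be cheap.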

tarcieri · 2023-03-24T15:22:28Z

Runtime autodetection sounds great!

jcape · 2023-03-24T15:58:25Z

Can I ask that you either use cpufeatures (non-vendored) or allow this to be selected manually? There's no CPUID in an SGX enclave.

koute · 2023-03-24T16:01:48Z

> Can I ask that you either use cpufeatures (non-vendored) or allow this to be selected manually? There's no CPUID in an SGX enclave.

Yes, I will still keep the ability to override/disable this; I'd just like autodetection to be the default.
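
One way to picture "autodetect by default, but allow an override" is the sketch below. The `Selected` and `Preference` types and the `resolve` function are hypothetical and not taken from any PR; the point is only that a forced choice never runs the detection code, which matters in environments without CPUID.

```rust
/// Hypothetical backend choice, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Selected {
    Serial,
    Avx2,
}

/// Hypothetical user preference: autodetect, or force a specific backend.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Preference {
    Auto,
    Force(Selected),
}

/// Resolves the preference. `detect` is only invoked for `Auto`, so a
/// forced backend never executes any CPU feature detection.
pub fn resolve(pref: Preference, detect: impl FnOnce() -> Selected) -> Selected {
    match pref {
        Preference::Force(backend) => backend,
        Preference::Auto => detect(),
    }
}
```

For example, `resolve(Preference::Force(Selected::Serial), detect)` returns `Selected::Serial` without ever calling `detect`.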

tarcieri · 2023-03-24T18:25:31Z

FWIW, I have a PR open to add AVX-512 support to cpufeatures: RustCrypto/utils#862

koute · 2023-03-27T10:49:50Z

(Rebased to resolve the conflict in Cargo.toml.)

OtaK · 2023-03-27T16:04:52Z

If I can interject in the conversation, I have a small question.

Knowing std::simd (and packed_simd_2) supports producing WASM SIMD instructions, I was wondering if it was possible to relax this check to allow choosing the SIMD backend when the platform_family is wasm and target_features includes simd128.

If that's a trivial change, that would allow WASM ed25519 to benefit from SIMD basically for free. If it's not, then never mind.

Edit: After looking at the PR, it seems x86_64 intrinsics are directly called so this isn't a trivial change. I guess this is for another PR?

koute · 2023-03-27T16:09:11Z

> If I can interject in the conversation, I have a small question.
>
> Knowing std::simd (and packed_simd_2) supports producing WASM SIMD instructions, I was wondering if it was possible to relax this check to allow choosing the SIMD backend when the platform_family is wasm and target_features includes simd128.
>
> If that's a trivial change, that would allow WASM ed25519 to benefit from SIMD basically for free. If it's not, then never mind.

All of the current SIMD backends use explicit architecture-specific compiler intrinsics, so that wouldn't do anything besides break compilation when targeting WASM. The current dependency on packed_simd is not there because the code takes advantage of the "it's SIMD but it's portable" aspect, but because it defines a few wrapper types for the SIMD types which are more convenient to use.

You'd basically have to add a WASM-specific backend.
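
To illustrate what a "wrapper type around an architecture-specific SIMD type" means, here is a minimal sketch. It is not the code from this PR's packed_simd.rs: the `wrapper` module, the method names, and the `add_lanes` helper are assumptions made for the example, and the scalar fallback exists only so the sketch builds on non-x86_64 targets.

```rust
#[cfg(target_arch = "x86_64")]
mod wrapper {
    use core::arch::x86_64::*;

    /// Thin newtype over the raw AVX2 register type (illustrative only).
    #[allow(non_camel_case_types)]
    #[derive(Clone, Copy)]
    pub struct u32x8(__m256i);

    impl u32x8 {
        /// Loads eight u32 lanes. Caller must ensure AVX2 is available.
        #[target_feature(enable = "avx2")]
        pub unsafe fn from_array(x: [u32; 8]) -> Self {
            unsafe { u32x8(_mm256_loadu_si256(x.as_ptr() as *const __m256i)) }
        }

        /// Stores the eight lanes back into an array.
        #[target_feature(enable = "avx2")]
        pub unsafe fn to_array(self) -> [u32; 8] {
            let mut out = [0u32; 8];
            unsafe { _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, self.0) };
            out
        }

        /// Lane-wise wrapping addition via the explicit AVX2 intrinsic.
        #[target_feature(enable = "avx2")]
        pub unsafe fn wrapping_add(self, rhs: Self) -> Self {
            unsafe { u32x8(_mm256_add_epi32(self.0, rhs.0)) }
        }
    }
}

/// Adds two sets of eight lanes, using the wrapper when AVX2 is present.
pub fn add_lanes(a: [u32; 8], b: [u32; 8]) -> [u32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime just above.
            return unsafe {
                wrapper::u32x8::from_array(a)
                    .wrapping_add(wrapper::u32x8::from_array(b))
                    .to_array()
            };
        }
    }
    // Scalar fallback for non-x86_64 targets or CPUs without AVX2.
    core::array::from_fn(|i| a[i].wrapping_add(b[i]))
}
```

Because the wrapper calls `_mm256_add_epi32` directly, it only exists on x86_64; a WASM build would need its own type built on that platform's intrinsics, which is why a separate backend would be required.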

jrose-signal · 2023-03-27T17:09:05Z

It's also important that whatever operations are used remain constant-time, which needs to be verified for every platform.

rozbb

This is great, and with hardly any changes! I left some comments, mostly in the packed_simd.rs file. I'm not super familiar with these intrinsics so some questions might be obvious.

src/backend/vector/packed_simd.rs

rozbb · 2023-03-29T07:08:34Z

@tarcieri looks good to go, modulo CI and readme updates to reflect that nightly isn't needed for AVX2 anymore.

Also I was looking for a source on why the AVX2 add, sub, mul, shl, and shr functions we use are constant time, but I got nothing.

koute · 2023-03-29T07:53:53Z

> looks good to go, modulo CI and readme updates to reflect that nightly isn't needed for AVX2 anymore.

Do you want me to also update those?

> Also I was looking for a source on why the AVX2 add, sub, mul, shl, and shr functions we use are constant time, but I got nothing.

I can't comment on the constant time-ness of the rest of the existing code as I haven't analyzed it in that regard, but AFAIK all of these should be constant time. Although this does kind of depend on the microarchitectural details of the CPU we're running on, I think all modern Intel and AMD CPUs should have all of these implemented as constant time.

(Also, please note that this PR doesn't really change much here; these were still used before, just indirectly through the packed_simd dependency.)

rozbb · 2023-03-29T08:01:52Z

Re docs: that'd be great actually! If you don't have time, one of us will do it, probably this week.

Re constant time: Yup, agreed, and it seems that every serious curve25519 impl uses AVX2. It would just be nice to be able to point to something that justifies our use of it.

koute · 2023-03-29T10:32:30Z

I've updated the README and I've also added a job to the CI to test this. (Hopefully it'll work?)

Initially I only added a job to build the code on Rust stable, but then I realized that these tests are not being run at all, and we should be able to run them, at least for AVX2 (which at this point is 10 years old, so it's a pretty safe bet to assume that the test runners should support it).

tarcieri · 2023-03-29T13:45:55Z

> Also I was looking for a source on why the AVX2 add, sub, mul, shl, and shr functions we use are constant time, but I got nothing.

They're pretty routinely used for cryptography. I would be fairly surprised to learn they have any data dependent timing variability, which I wouldn't expect from any arithmetic instructions on modern CPUs whether they're SIMD or not.

(They can, however, have timing variability based on thermal throttling, but that's a whole different can of worms)

rozbb · 2023-03-29T17:29:29Z

@tarcieri good to merge? This might also require an MSRV bump right?

tarcieri

LGTM.

I don't think it needs an MSRV bump? AVX (<512) intrinsics have been stable for quite a while.

Maybe there could be an MSRV test with +avx2 enabled? But really the answer there is runtime feature gating as @koute mentioned.

rozbb · 2023-03-29T21:59:10Z

Oh ok. I just assumed this wouldn't work with anything prior to the version that stabilized AVX2.

rozbb · 2023-03-30T05:48:52Z

Oh I was under the impression that stable AVX2 was new. Apparently it was stabilized in 2018. I'll try to consolidate the test to run both builds in MSRV.

Also, I just ran AVX2 tests on my x86_64 machine and everything passes.

rozbb · 2023-03-30T06:18:48Z

Thanks again @koute! This is a great addition

koute · 2023-03-30T06:25:22Z

Thanks!

koute · 2023-04-11T12:07:26Z

To anyone who was following this PR, my followup PR with runtime autodetection is now up: #523

koute added 9 commits March 27, 2023 19:44

Remove dependency on packed_simd

9312a13

Support SIMD on stable Rust

bd32f0b

Move packed_simd.rs to vector module

7343254

Add comment header to packed_simd.rs

1761028

Initialize SIMD registers using intrinsics instead of transmute

94035c5

Use a splat inside of unpack_pair

bccad56

Add #[allow(dead_code)]s

0db286b

Replace unwrap with expect in build.rs to make Clippy happy

d0a3709

Allow missing docs for the packed_simd module

a448528

koute force-pushed the main_stable_simd branch from e8eefb4 to a448528 March 27, 2023 10:49

rozbb requested changes Mar 27, 2023

koute and others added 6 commits March 28, 2023 21:50

Document the PartialEq impl for the SIMD wrapper types

94aee21

Remove the IntoBits trait

7393812

Remove the i32x8 wrapper type

5f1f887

Add a wrapper method for _mm256_mul_epu32

590f20e

Add extra doc comments

70f6b69

Added docs on why signed/unsigned arithmetic conversion is OK

69dbacf

rozbb approved these changes Mar 29, 2023

koute added 3 commits March 29, 2023 19:24

Update README: the AVX2 backend now works on stable Rust

ad56e1f

Add a CI job to also build the AVX2 SIMD backend on Rust stable

6288388

Actually run the tests for the AVX2 SIMD backend on CI

f5e2d2e

tarcieri approved these changes Mar 29, 2023

rozbb added 4 commits March 30, 2023 01:22

Added SIMD MSRV test; docs

4bfe9ce

Fixed yaml

cada35f

Fixed yaml again

fa52eaf

CI wibble

07e5b8d

rozbb added 2 commits March 30, 2023 01:50

Put AVX2 build in MSRV test

60ce89a

Added comment

6b676d4

rozbb merged commit 4583c47 into dalek-cryptography:main Mar 30, 2023
14 checks passed

tarcieri mentioned this pull request Apr 2, 2023

simd_backend: replace packed_simd_2 with std::simd #415

Closed

daira mentioned this pull request Jun 9, 2023

Update ed25519-zebra dependency to 4.x zcash/zcash#6705

Merged

burdges mentioned this pull request Jun 12, 2023

approval-distribution: process assignments and votes in parallel paritytech/polkadot-sdk#732

Open
