Unfortunately, neither `u128` nor `swap_bytes` is supported directly by WebAssembly, so both implementations of `folded_multiply` are very slow.
I think an algorithm that takes both `u64` values, packs them into a `v128` vector, and then does some swizzling and vector multiplications would probably be much faster. Here is a Godbolt link with a little sketch:
https://rust.godbolt.org/z/jGGhYjGs8
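For discussion, here is a minimal scalar sketch of the arithmetic such a version would have to reproduce. It is not the code from the Godbolt link. It assumes that the `u128` variant of `folded_multiply` computes the full 128-bit product and XORs its high and low 64-bit halves, and it rebuilds that result from four 32x32 to 64-bit partial products. These are the kind of widening multiplies that Wasm SIMD offers (for example `i64x2.extmul_low_i32x4_u`). The function names below are hypothetical.

```rust
/// Reference version: full 128-bit product, then XOR of the two 64-bit halves.
/// Assumed to match what the `u128` variant of `folded_multiply` computes.
fn folded_multiply_u128(a: u64, b: u64) -> u64 {
    let r = (a as u128).wrapping_mul(b as u128);
    (r as u64) ^ ((r >> 64) as u64)
}

/// Hypothetical variant that avoids `u128` by using four 32x32 -> 64-bit
/// partial products and propagating the carries by hand.
fn folded_multiply_32(a: u64, b: u64) -> u64 {
    const MASK: u64 = 0xffff_ffff;
    let (a_lo, a_hi) = (a & MASK, a >> 32);
    let (b_lo, b_hi) = (b & MASK, b >> 32);

    // Each product of two 32-bit values fits in 64 bits.
    let ll = a_lo * b_lo;
    let lh = a_lo * b_hi;
    let hl = a_hi * b_lo;
    let hh = a_hi * b_hi;

    // Sum everything that lands in bits 32..64 of the 128-bit product.
    // The sum is below 3 * 2^32, so it cannot overflow a u64.
    let mid = (ll >> 32) + (lh & MASK) + (hl & MASK);

    // Low 64 bits: bits 0..32 from `ll`, bits 32..64 from `mid`.
    let lo = (mid << 32) | (ll & MASK);
    // High 64 bits: `hh` plus the upper halves and the carry out of `mid`.
    let hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);

    lo ^ hi
}
```

Whether a `v128` formulation of this actually beats the current code on a given Wasm engine would still need to be benchmarked.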
I don't know enough about how to verify the quality, so I decided not to open a PR directly and to discuss the feasibility first.