Support target_arch spirv #56

Merged
merged 8 commits into from Jul 17, 2022


charles-r-earp
Contributor

This PR makes f16 and bf16 usable for the spirv target, when using rust-gpu.

The rust-gpu compiler backend, rustc_codegen_spirv, compiles most of the core lib, but it is strict about pointer casts, so operations on str often won't compile. For target_arch = "spirv", I therefore disabled the impls of FromStr, Debug, Display, LowerExp, UpperExp, Binary, Octal, LowerHex, and UpperHex.
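
For readers unfamiliar with this kind of gating, here is a minimal sketch of how a formatting impl can be compiled out for one target. The `Half` type below is a hypothetical stand-in, not the crate's actual `f16` or `bf16`, and the PR's real impls may be organized differently.

```rust
use core::fmt;

/// Hypothetical stand-in for a 16-bit float wrapper (an assumption for
/// illustration, not the crate's real type).
#[derive(Clone, Copy, PartialEq)]
pub struct Half(pub u16);

impl Half {
    /// Plain bit access involves no `str` handling, so it stays available
    /// on every target.
    pub fn to_bits(self) -> u16 {
        self.0
    }
}

// Formatting goes through `str`, so this impl is only compiled when the
// target is not spirv.
#[cfg(not(target_arch = "spirv"))]
impl fmt::Debug for Half {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Half({:#06x})", self.0)
    }
}
```

On a host target the `Debug` impl is present as usual. When compiling for spirv the impl is simply absent, so any code that tries to format a `Half` fails at compile time rather than inside the codegen backend.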

Additionally, the leading_zeros method, which is used for conversions, requires intrinsics on spirv, so I added a fallback for when those intrinsics are not available. I also added a test to validate the fallback algorithm.
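
As an illustration of what an intrinsic-free fallback can look like, the sketch below counts leading zeros of a `u16` with a binary search over bit ranges. The function name `leading_zeros_u16_fallback` is hypothetical, and this is not necessarily the algorithm used in the PR.

```rust
/// Hypothetical helper: counts the leading zero bits of a `u16` using only
/// comparisons, masks, and shifts (no compiler intrinsics).
pub fn leading_zeros_u16_fallback(mut x: u16) -> u32 {
    if x == 0 {
        return 16;
    }
    let mut n = 0;
    // At each step, if the top half of the remaining window is empty,
    // record its width and shift the lower half up into its place.
    if (x & 0xFF00) == 0 {
        n += 8;
        x <<= 8;
    }
    if (x & 0xF000) == 0 {
        n += 4;
        x <<= 4;
    }
    if (x & 0xC000) == 0 {
        n += 2;
        x <<= 2;
    }
    if (x & 0x8000) == 0 {
        n += 1;
    }
    n
}
```

Because a `u16` has only 65,536 possible values, a test on a host target can compare such a fallback against the built-in `u16::leading_zeros` exhaustively.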

Usage

I'm using this in autograph. Until now I have been borrowing some conversion methods to operate on a pack of 2 bf16 values, but I figured out how to do 16-bit operations with the appropriate hardware capabilities, which is much more natural and allows for generic functions.

Related

This could be useful to the Rust-CUDA project as well, which takes a similar approach to rust-gpu for targeting CUDA / nvptx. It would be interesting to see if f16 / bf16 work with or without these changes.

@starkat99
Owner

Very nice, I'd been wanting to add spirv support when I had the time to look into it. I looked it over and don't see any issues with merging once the CI checks pass. It looks like it needs a rustfmt run.

@starkat99 starkat99 merged commit e231659 into starkat99:main Jul 17, 2022