Replies: 1 comment
-
Ahh, it seems to work on GPU, so it could be that …
-
Hi there,
I have a custom model that uses a TransformerDecoder module and a boolean causal mask, as follows:
This works fine on both CPU and GPU. However, I'm trying to use mixed precision for training, as follows:
and I'm running into the following error:
RuntimeError: Expected attn_mask dtype to be bool or to match query dtype, but got attn_mask.dtype: float and query.dtype: c10::BFloat16 instead.
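(The poster's snippets were not preserved in this transcript. As a minimal sketch of the dtype rule behind the error, with shapes and names that are assumptions rather than the poster's code: `scaled_dot_product_attention` accepts an `attn_mask` that is either `bool` or the same dtype as the query, and a `float32` mask paired with `bfloat16` queries trips exactly this check.)

```python
import torch
import torch.nn.functional as F

# Assumed shapes: (batch, heads, seq_len, head_dim); not the poster's model.
q = torch.randn(1, 2, 4, 8, dtype=torch.bfloat16)
k = torch.randn(1, 2, 4, 8, dtype=torch.bfloat16)
v = torch.randn(1, 2, 4, 8, dtype=torch.bfloat16)

# A bool causal mask (True = position may attend, in SDPA's convention)
# is accepted regardless of the query dtype:
bool_mask = torch.tril(torch.ones(4, 4, dtype=torch.bool))
out = F.scaled_dot_product_attention(q, k, v, attn_mask=bool_mask)

# A float32 additive mask with bfloat16 queries raises the RuntimeError
# quoted above; casting the mask to the query dtype is one way around it:
float_mask = torch.zeros(4, 4)  # float32 — would fail as-is
out2 = F.scaled_dot_product_attention(q, k, v, attn_mask=float_mask.to(q.dtype))
```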
As can be seen in the code above, I've tried casting the mask to `bool` everywhere, as I think that's compatible with `bfloat16` (though I may be mistaken). I've also tried using:
but I'm getting the same error.
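(One thing worth checking, as an assumption about the cause since the full snippet isn't shown: `nn.Transformer.generate_square_subsequent_mask` returns a *float32* mask, `0.0` on and below the diagonal and `-inf` above it, so a mask built that way stays float even when the model runs under autocast. A sketch of two ways to make such a mask acceptable with `bfloat16`:)

```python
import torch
import torch.nn as nn

seq_len = 8

# Float32 additive causal mask: 0.0 where attention is allowed, -inf above
# the diagonal where it is not.
float_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)

# Option 1: convert to bool (True = masked, the convention the
# nn.Transformer* modules use for a boolean tgt_mask):
bool_mask = torch.isinf(float_mask)

# Option 2: keep it additive, but cast it to the autocast compute dtype
# before the forward pass:
bf16_mask = float_mask.to(torch.bfloat16)
```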
Any ideas appreciated, thanks!