Add int8 support in fused_multi_transformer_pass and fuse_multi_transformer_layer_pass #48209
Conversation
This reverts commit 4f068e8, a commit which has already been merged:
Your PR was submitted successfully. Thank you for contributing to the open source project!
#else
nullptr,
hshshsh
Was this changed by mistake?
Got it. That was test code I forgot to delete.
quant_last_in_scale /
dequant_out_scale[it][jt]) +
dequant_out_scale[it][jt]
// quant_last_in_scale /
How about just deleting it?
done~
Please submit another PR to improve the error messages.
But we need to unify the delete_weight_dequant_linear_op pass.
LGTM
…former_layer_pass (PaddlePaddle#48209)
* delete unnecessary shape and slice op
Co-authored-by: Your Name <you@example.com>
PR types
Others
PR changes
Others
Describe
PR #45907 and PR #47541 finshed fusing fp16/32 GPT3 model to a single fused_multi_transformer op, which can speed up inference of GPT3. In this PR, we add int8 support, the operations added is as follows:
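For readers unfamiliar with the int8 path, the general idea is to quantize float activations into int8 before the int8 GEMMs inside the fused op and dequantize the int32 GEMM outputs back to float afterwards. The sketch below illustrates only the scale arithmetic; the function and parameter names (`quant_in_scale`, `dequant_out_scales`) are illustrative and are not the actual attributes of the fused_multi_transformer op:

```python
def quantize(xs, quant_in_scale):
    # Map float activations into the int8 range [-127, 127] using a
    # per-tensor input scale: round to nearest, then clip.
    out = []
    for x in xs:
        q = round(x * 127.0 / quant_in_scale)
        out.append(max(-127, min(127, q)))
    return out

def dequantize(ys, dequant_out_scales):
    # Recover float values from int32 GEMM outputs using a
    # per-channel output scale.
    return [y * s for y, s in zip(ys, dequant_out_scales)]

# Round trip: values within the scale are recovered approximately,
# values outside saturate at the int8 boundary.
q = quantize([0.5, -1.0, 2.0], quant_in_scale=2.0)
x_hat = dequantize(q, [2.0 / 127.0] * 3)
```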