[intel-npu] Implementing DEVICE_GOPS metric #24439
device_gops[ov::element::f32] = 0;
device_gops[ov::element::u8] = gops;
device_gops[ov::element::i8] = gops;
device_gops[ov::element::f16] = 0.5f * gops;
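For readers following the thread, the four assignments above can be read as filling a per-precision table from a single peak-throughput value. The following is a minimal, self-contained sketch of that idea, not the plugin's actual code: `ElementType` is a hypothetical stand-in for the `ov::element` types so the snippet compiles without OpenVINO, and `make_device_gops` is an illustrative helper name that does not appear in the PR.

```cpp
#include <map>

// Hypothetical stand-in for the ov::element types used in the diff,
// so that this sketch builds without OpenVINO headers.
enum class ElementType { f32, f16, u8, i8 };

// Illustrative helper (not from the PR): derives the per-precision table
// from one peak throughput figure, mirroring the four assignments in the diff.
std::map<ElementType, float> make_device_gops(float gops) {
    std::map<ElementType, float> device_gops;
    device_gops[ElementType::f32] = 0.0f;        // reported as 0 in the diff
    device_gops[ElementType::u8] = gops;         // full rate for 8-bit integers
    device_gops[ElementType::i8] = gops;
    device_gops[ElementType::f16] = 0.5f * gops; // half rate; the thread gives no rationale beyond Arch team guidance
    return device_gops;
}
```

With an assumed input of `gops = 10.0f`, the table would hold 10 for `u8` and `i8`, 5 for `f16`, and 0 for `f32`.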
For my education, why do we multiply by 0.5 for fp16?
This is based on guidance from the Arch team; I don't know the details, unfortunately.
Ticket in description if you don't mind
I still need to add a cross-OS backward-compatibility check. We are currently working on aligning with the driver teams, as older drivers don't report perf/slice in physicalEUSimdWidth, which leads to incorrect numbers.
c8fbf89 to 7582072
Details:
Tickets: