[webgpu] update perf test #6438
Conversation
05264a8 to 2d2c688
LGTM
Another comment: why are there so many variables, all defined inside functions? Could we define some of them at global scope, or group related variables together, to reduce the number of definitions?
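The grouping suggested above could look like the following sketch. The `BenchmarkConfig` name and all of its fields are hypothetical illustrations, not code from this PR:

```typescript
// Hypothetical example: collect related per-run settings into one
// module-scope object instead of scattering `let` declarations
// inside each function. Names are illustrative only.
interface BenchmarkConfig {
  warmupRuns: number;
  profileRuns: number;
  shapes: number[][];  // each entry is [dimAOuter, dimInner, dimBOuter]
}

const config: BenchmarkConfig = {
  warmupRuns: 50,
  profileRuns: 100,
  shapes: [[16, 512, 16], [64, 64, 64]],
};

// Functions then take the config instead of redeclaring locals.
function describeRun(cfg: BenchmarkConfig): string {
  return `${cfg.shapes.length} shapes x ${cfg.profileRuns} runs`;
}
```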
const m = tf.matMul(tensorsWarmUp.tensorA, tensorsWarmUp.tensorB);
result.dispose();
m.dispose();
}
In `tf.profile`:
- I think we mainly profile `const result = tf.matMul(tensors[i].tensorA, tensors[i].tensorB)`; would including the `warmup model` and `warmup gpu` parts influence the correctness?
- Why do we still need to execute `tf.matMul(tensorsWarmUp.tensorA, tensorsWarmUp.tensorB)` in the 3rd for-loop? I think its cost may be larger than `tf.matMul(tensors[i].tensorA, tensors[i].tensorB)`, which is the real main workload.
- How do we make sure that all matMul commands are executed one by one? I think they may be executed asynchronously here?
- Yes, `warmup model` is for the first run of each matmul shape (which incurs large shader-generation and buffer-creation time, leading to a significant deviation from the average value), and `warmup gpu` is to make sure every input shape is measured at the same GPU frequency.
- The idea of inserting a large warmup tensor between each main workload is also to keep the frequency high; the real main workloads would pull the frequency down if they executed back to back. Please see the pictures below:
with: keeps the frequency at almost 1200MHz
without: visible jitter, breaking measurement accuracy (only happens on WebGPU)
- You're right; serial execution will be considered in another PR.
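The interleaving idea described above can be sketched as follows. `runWarmup` and `runWorkload` are hypothetical stand-ins for the actual `tf.matMul` calls; here they only record execution order to show the structure:

```typescript
// Sketch: run a large "warmup" workload before each measured workload
// so the GPU stays at a high clock frequency between measurements.
// The two functions below are stubs, not the PR's real code.
const order: string[] = [];
const runWarmup = () => order.push('warmup');
const runWorkload = (i: number) => order.push(`workload${i}`);

function profileInterleaved(numWorkloads: number) {
  for (let i = 0; i < numWorkloads; i++) {
    runWarmup();     // keeps clocks high before each measurement
    runWorkload(i);  // the op whose time we actually care about
  }
}

profileInterleaved(2);
// order: ['warmup', 'workload0', 'warmup', 'workload1']
```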
for (let i = 0; i < inputs.length; i++) {
  let dimAOuter = parseInt(inputs[i].split(',')[0]);
  let dimInner = parseInt(inputs[i].split(',')[1]);
  let dimBOuter = parseInt(inputs[i].split(',')[2]);
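If the string encoding is kept, the repeated `split`/`parseInt` calls above could be collapsed into a single destructuring parse. This is a sketch of an alternative, not code from the PR; `parseShape` is a hypothetical helper name:

```typescript
// Parse an "A,I,B" shape string once instead of splitting it
// three times. Behavior matches the loop above for well-formed input.
function parseShape(input: string): [number, number, number] {
  const [dimAOuter, dimInner, dimBOuter] = input.split(',').map(Number);
  return [dimAOuter, dimInner, dimBOuter];
}

// e.g. parseShape('16,512,16') yields [16, 512, 16]
```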
Why do we wrap `dimAOuter`, `dimInner`, and `dimBOuter` into a string here, which then has to be encoded and decoded? Why not put these values into an array or a structure?
Each row has a `rerun` button to re-run a single case for double-checking. This button reads the shape info (e.g. `[16, 512, 16]`) from an HTML element, whose value must be a string, so here I used the raw data as I get it from the page.
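Decoding that attribute string back into numbers could be done with `JSON.parse`, since the bracketed form shown above is valid JSON. A minimal sketch; `decodeShape` is a hypothetical helper, not the PR's actual decoder:

```typescript
// Turn an HTML attribute value like "[16, 512, 16]" back into numbers.
// The bracketed comma-separated form is valid JSON, so JSON.parse
// handles it directly.
function decodeShape(attr: string): number[] {
  return JSON.parse(attr) as number[];
}

// e.g. decodeShape('[16, 512, 16]') yields [16, 512, 16]
```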
07bc3bc to b035bec
LGTM, thanks.
As we synced offline, we can do more refactoring on top of this.