I am running the ackermann benchmark with wasmtime, and I noticed a performance delta of approx 30% compared with native. Profiling with VTune, I see that the wasmtime disassembly contains a lot of call-stack setup/teardown instructions at the beginning and end of the function, while native (clang, -O3) does not.
I used wasmtime explore to correlate the wat with disassembly as well. Here are the snippets of disassembly -
Native disassembly is pretty short, the entirety of the function is as shown below (this is in at&t syntax, unlike Intel syntax in some above snippets) -
and the C source function to generate wasm and native is -
int ackermann(int M, int N)
{
    if (M == 0)
    {
        return N + 1;
    }
    if (N == 0)
    {
        return ackermann(M - 1, 1);
    }
    return ackermann(M - 1, ackermann(M, (N - 1)));
}
I also tried the --wasm-features tail-call CLI flag; however, that actually made the perf slightly worse.
Any pointers on the difference in disassembly between native and wasm?
We have two clobber-saves (r12 and r15), whereas the native code gets away with one (rbx). It would be a good exercise to trace through the assembly and see what the registers are used for; perhaps the native compiler's register allocator is able to be a bit smarter about reuse. It is fundamentally necessary to have some state on the stack I think, since there is a recursive call (the one in non-tail position on the second-to-last line of C) and there is at least one word of state (M) necessary after it returns.
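To make the "one word of state" point concrete, here is a rough sketch of the same function with its two tail calls rewritten by hand as a loop. This is only an illustration of the shape an optimizer could plausibly reduce the function to, not something read off either disassembly, and the name `ackermann_loop` is made up for this example. The only real call left is the inner one, and M has to survive across it, so it must live in a callee-saved register or a stack slot.

```c
/* Hypothetical hand-rewritten variant of the benchmark function.
 * The two calls in tail position become loop iterations; the inner
 * call in non-tail position remains a real recursive call. */
int ackermann_loop(int M, int N)
{
    while (M != 0)
    {
        if (N == 0)
        {
            /* was: return ackermann(M - 1, 1); */
            N = 1;
        }
        else
        {
            /* Non-tail call: M is still needed after this returns,
             * so it must be preserved across the call. */
            N = ackermann_loop(M, N - 1);
        }
        /* Both branches continue with M - 1. */
        M = M - 1;
    }
    /* was: if (M == 0) return N + 1; */
    return N + 1;
}
```

Calling `ackermann_loop(2, 3)` should give the same result as `ackermann(2, 3)`, which is 9. In this form N is overwritten by the call's result, so only M is live across the call.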
And FWIW, it is known that the tail calling convention can currently lead to some slowdowns, which is why Wasm tail calls aren't enabled by default yet: #6759