Allow not copying the wasm binary into the module when using WASM C API #3389
Not very sure about it. The binary content (in `wasm_module_t`) will be used during execution. It is on purpose that the binary is released only when the store is destroyed.
I think we can give it a try. It really increases memory usage if wasm-c-api clones another copy. If it doesn't, the content of the developer's input buffer may be modified, and the buffer will still be referenced by the runtime after loading; the difference is that the developer cannot reuse the input buffer to create another module.
Is that the case? When opening this PR, I had it tested only in interpreter mode and I thought it wouldn't work with AOT (that's why I created it as a draft and didn't bother fixing the CI errors). |
d1bb328
to
293cfcd
Compare
if this means IIUC, the big concern is function bytecodes. Classic-interp always needs the binary content. fast-interp doesn't, because of recompilation. xxx_jit needs the content only during bytecode-to-IR translation. aot is there ⬆️. XIP always depends on the binary content. Then, the next problem will be const strings. Luckily for us, some function, like
iirc, it isn't by luck. |
Done that, it works fine. I wasn't precise there, you can't release right after module instantiation, you have to wait until wasm execution (e.g.
Just tried with classic interpreter (I had only tried fast interpreter and AOT successfully) and as you say this PR doesn't work there.
What about those functions? Why do they need special treatment? During testing I didn't run into any problems with those. So changes in this PR only work for fast interpreter and AOT (without XIP). Possibly JIT too, but I haven't tested it yet. |
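To keep the per-mode claims made so far in one place, here is a minimal sketch in C. The enum and the function are hypothetical names invented for illustration and are not part of WAMR; the values only restate what commenters have said up to this point (JIT is treated conservatively because it was not tested, and later comments in this thread refine the picture for data segments).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum for illustration only; not a WAMR type. */
typedef enum {
    MODE_CLASSIC_INTERP,
    MODE_FAST_INTERP,
    MODE_JIT,
    MODE_AOT,
    MODE_AOT_XIP
} exec_mode_t;

/* Returns true if, per the claims in this thread so far, function
 * bytecode in the input buffer may still be needed after loading. */
bool
bytecode_needed_after_load(exec_mode_t mode)
{
    switch (mode) {
        case MODE_CLASSIC_INTERP:
            return true; /* interprets the original bytecode */
        case MODE_AOT_XIP:
            return true; /* executes in place from the buffer */
        case MODE_JIT:
            /* Needed during bytecode-to-IR translation; untested in
             * the thread, so assumed unsafe here. */
            return true;
        case MODE_FAST_INTERP:
            return false; /* bytecode is recompiled at load time */
        case MODE_AOT:
            return false; /* reported to work without XIP */
    }
    return true; /* unknown mode: be conservative */
}
```

A caller would free the input buffer right after loading only when this function returns false for the mode the runtime was built with.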
This PR is not about stream-loading. It tries to save memory when using WAMR by allowing to release the wasm binary buffer before starting the wasm execution. |
293cfcd
to
627cff4
Compare
the "make a copy instead of keeping references" aspect is basically same. |
iirc, one of implications of is_load_from_file_buf=false is to make a copy. i guess you can somehow use it. |
I see that a copy is done anyway regardless of that flag
That flag seems to be used to shift the string
But again, if we avoid freeing the module buffer until the wasm execution that shouldn't be needed. |
627cff4
to
7971818
Compare
in the latter case, the user-given buffer is used.
i don't understand what you mean. can you explain a bit? |
I was wrong, I wasn't considering cases like this one #3389 (comment) |
24cfd0b
to
56a4f5d
Compare
56a4f5d
to
20416a9
Compare
doc/memory_tune.md
@@ -30,3 +30,4 @@ Normally there are some methods to tune the memory usage:
- set the app heap size with `wasm_runtime_instantiate`
- use `nostdlib` mode, add `-Wl,--strip-all`: refer to [How to reduce the footprint](./build_wasm_app.md#2-how-to-reduce-the-footprint) of building wasm app for more details
- use XIP mode, refer to [WAMR XIP (Execution In Place) feature introduction](./xip.md) for more details
- when using the Wasm C API, set `clone_wasm_binary=false` in `LoadArgs` and free the wasm binary buffer (with `wasm_byte_vec_delete`) after module loading
I discussed with @lum1n0us, per our understanding, currently it is only available for fast interpreter mode and AOT mode. For classic interpreter and all JIT modes, the bytecode is used by the interpreter or by the JIT compiler after loading, we may try to clone the bytecode to fix it, but we can do it in another PR.
So could you mention this here to avoid misunderstanding?
And could you also add another item for the wasm/AOT loader, since we can now set `wasm_binary_freeable=true` in `LoadArgs` and free the wasm binary buffer after module loading? Similarly, it is only available for fast interpreter mode and AOT mode for now.
> I discussed with @lum1n0us, per our understanding, currently it is only available for fast interpreter mode and AOT mode

What is supposed to break with classic interpreter? Because I gave it a quick try and it seemed to run fine.
Anyway, I updated the doc.
The bytecodes in code section is used in classic interpreter after loading:
https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/iwasm/interpreter/wasm_loader.c#L3700-L3710
#3389 (comment) So, yes, I actually free the wasm binary buffer after module instantiation, not loading. In that way, it works with classic interpreter too.
In my case it doesn't make much of a difference whether the binary is freed after loading or after instantiation. The main goal is to free it before wasm execution, because the execution will start using additional memory and we don't want the overhead of the binary buffer on top of that.
20416a9
to
5abaac0
Compare
LGTM
Please correct me if I am wrong. Your examples will free/release/pollute binary content after module loading (before instantiation). And it works well in classic-interp, fast-interp, jit and aot. |
my impression is that this requires too much implementation knowledge from users. how about providing an api to query if it's safe to free the buffer? eg. `bool has_reference_to_underlying_byte_vec(const wasm_module_t *module);`
The wasm-c-api wasm_module_ex_t's So, per my understanding, developer can use the if (wasm_module_is_underlying_binary_cloned(module)) { //or if (wasm_module_is_underlying_binary_freeable(module))
wasm_byte_vec_delete(binary);
wasm_runtime_free(binary);
} for wasm/aot loader, his code is somewhat like:
@eloparco @yamt not sure whether that is ok for you? Or how about providing |
Do we need two separate APIs? One (i.e. And
It's not module type but rather execution mode. I wouldn't add a check on that but rather write in the documentation that, with fast interpreter and AOT, the module can be freed after loading; while, in classic interpreter mode, after instantiation. |
why not? |
i believe you can't free the buffer for classic interpreter. |
Yeah, I mean one API for wasm-c-api and one for wamr runtime api, and I found you had done that in the PR, it LGTM. |
Agree, how about we merge this PR first and then submit another PR to clone the wasm file's bytecode for classic interpreter mode to avoid referring to the underlying buffer? |
i'm not sure if it's a good idea. i feel it's simpler to make wasm_runtime_is_underlying_binary_freeable return false for now, until someone evaluates pros/cons. |
OK, cloning bytecode for classic interpreter may save the memory of data segments, but as you said, we can just let |
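The "query before freeing" idea discussed above can be sketched as follows. This is a toy model, not WAMR code: `toy_module_t`, `toy_is_underlying_binary_freeable` and `toy_release_binary` are hypothetical names, and the `refers_to_input_buffer` field stands in for whatever the runtime knows about its own execution mode (for example, classic interpreter bytecode still pointing into the buffer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a loaded module. */
typedef struct {
    bool binary_freeable_requested; /* caller asked for a freeable binary */
    bool refers_to_input_buffer;    /* runtime still points into the buffer */
} toy_module_t;

/* Conservative query: say "freeable" only if the caller asked for it
 * and the runtime holds no references into the input buffer. */
bool
toy_is_underlying_binary_freeable(const toy_module_t *m)
{
    return m->binary_freeable_requested && !m->refers_to_input_buffer;
}

/* Frees *buf and sets it to NULL only when the query allows it.
 * Returns true if the buffer was released. */
bool
toy_release_binary(const toy_module_t *m, unsigned char **buf)
{
    if (!toy_is_underlying_binary_freeable(m))
        return false;
    free(*buf);
    *buf = NULL;
    return true;
}
```

With this shape the caller needs no knowledge of execution modes: in a mode that keeps references, the query simply returns false and the buffer stays alive until the module is unloaded.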
0262d6c
to
d5595db
Compare
d5595db
to
af265e0
Compare
LGTM except two minor comments.
@@ -1400,6 +1406,7 @@ wasm_runtime_load_from_sections(WASMSection *section_list, bool is_aot,
LOG_DEBUG("WASM module load failed from sections");
Seems we had better also set `args.is_binary_freeable = false` in `wasm_runtime_load` (see L1385), though `LoadArgs args = { 0 };` already sets it to false?
Actually, while I was able to free the buffer after loading in the simple example in this PR, I see problems when doing it with a more complex example. During module instantiation, this part is causing problems
Everything is fine if I free the buffer after module instantiation instead. [EDIT]: Actually I can see the problem already with the sample in this example, it outputs: calling into WASM function: mul7, mul7 return 21
0x15:i32 when turning fast interpreter on. Let's agree on that and I'll modify the PR accordingly. I think it makes sense to have this PR cover the case of free after instantiation and if we want to investigate if it's possible to free after loading we can do it separately. |
Hi, I checked the source code again, there are really several places in wasm/aot module which refer to the input buffer even when we set
It may be used in wasm instantiation, that's why your case works if we free buffer after instantiation but doesn't work before it. But the issue is that the passive data segment may be also used in opcode memory.init:
And note that the issue only exists in interpreter mode since the aot loader will allocate new memory for the data segment: wasm-micro-runtime/core/iwasm/aot/aot_loader.c Lines 1002 to 1017 in 8f098a5
wasm-micro-runtime/core/iwasm/interpreter/wasm_loader.c Lines 5121 to 5124 in 8f098a5
wasm-micro-runtime/core/iwasm/aot/aot_loader.c Lines 931 to 934 in 8f098a5
It is mainly used by wamrc and the API wasm_runtime_get_custom_section exported in wasm_export.h I am not sure how to handle them, for 1, there may be two options: for 2, maybe we can check whether for 3, maybe we can ignore it, and mention in the document that after the underlying binary buffer is freed, developer cannot call wasm_runtime_get_custom_section again. |
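The data segment problem described above can be illustrated with a small self-contained model. Everything here is hypothetical (`toy_data_seg_t`, `toy_load_data_seg`, `toy_memory_init`); it only contrasts a segment that borrows bytes from the input buffer with one that owns a copy, which is what the thread says the AOT loader does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical data segment record. */
typedef struct {
    unsigned char *data; /* segment bytes */
    size_t len;
    bool owned;          /* true if data is a private copy */
} toy_data_seg_t;

/* Loads the segment found at buf[offset .. offset + len).
 * With clone == false the segment points into buf. */
bool
toy_load_data_seg(toy_data_seg_t *seg, unsigned char *buf, size_t offset,
                  size_t len, bool clone)
{
    seg->len = len;
    seg->owned = clone;
    if (!clone) {
        seg->data = buf + offset; /* borrowed: buf must outlive seg */
        return true;
    }
    seg->data = malloc(len ? len : 1);
    if (!seg->data)
        return false;
    memcpy(seg->data, buf + offset, len);
    return true;
}

/* Toy memory.init: copies n bytes of the segment into linear memory,
 * with bounds checks on both source and destination. */
bool
toy_memory_init(unsigned char *mem, size_t mem_size, size_t dst,
                const toy_data_seg_t *seg, size_t src, size_t n)
{
    if (src > seg->len || n > seg->len - src)
        return false;
    if (dst > mem_size || n > mem_size - dst)
        return false;
    memcpy(mem + dst, seg->data + src, n);
    return true;
}

void
toy_unload_data_seg(toy_data_seg_t *seg)
{
    if (seg->owned)
        free(seg->data);
    seg->data = NULL;
    seg->len = 0;
}
```

If the input buffer is overwritten after loading, a later `toy_memory_init` on the borrowed segment copies the overwritten bytes, while the cloned segment still yields the original data. That is the reason a passive segment used by `memory.init` during execution prevents freeing the buffer unless the segment is cloned.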
Thanks for spotting that
I'll try to update the code to use (1) and see how it goes. How does (2) fix the problem with memory.init? Memory.init would be called during wasm execution and we won't be able to free the wasm binary buffer before that.
And what do we do if it's NULL?
Yes, makes sense to me |
No better way to fix (2), what we can do is to assume that Agree to use (1) to fully resolve the issue, and we may also free the dropped data segments after they are dropped - not in this PR, we can submit another PR to refine it.
Right, done that now, but I needed to change the API to accept the instance instead of the module as an argument. |
d2d0deb
to
a8c3605
Compare
*/
WASM_RUNTIME_API_EXTERN bool
wasm_runtime_is_underlying_binary_freeable(
    const wasm_module_inst_t module_inst);
This means the binary can be freed only after instantiation. Would it be better to clone the data segments in the wasm module, and clone `string_literal_ptrs` in both wasm and aot modules, when `wasm_binary_freeable` is true, so that we can free the input buffer after loading? If it is a little complex, we can help do it after this PR is merged.
Yeah, we would need to clone the `data_dropped` bitmap into the module to avoid using the instance. But, as you say, we can do that in a separate PR.
Should be to clone the data segments in wasm file but not to clone the data_dropped bitmap. Anyway, let's do it in another PR.
ca604e6
to
afaef9b
Compare
afaef9b
to
7e94c60
Compare
LGTM
The `wasm_module_new` function copies the WASM binary passed as an argument into the module. This PR allows passing an additional flag to `wasm_module_new_ex` to avoid that copy. In that way, the binary can be manually released after module loading (instead of having to wait for the store to be destroyed).
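The memory trade-off in this description can be sketched with a toy model in C. The types and functions below (`toy_load_args_t`, `toy_cmodule_t`, `toy_module_new_ex`, and so on) are hypothetical and only mirror the names used in the thread; they are not the WAMR implementation and do not model what the real loader copies for itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the load-time flag discussed in the PR. */
typedef struct {
    bool clone_wasm_binary;
} toy_load_args_t;

/* Hypothetical module that either owns a copy of the binary or
 * borrows the caller's buffer. */
typedef struct {
    const unsigned char *binary;
    size_t size;
    bool owns_binary;
} toy_cmodule_t;

toy_cmodule_t *
toy_module_new_ex(const unsigned char *binary, size_t size,
                  const toy_load_args_t *args)
{
    toy_cmodule_t *m = malloc(sizeof(*m));
    if (!m)
        return NULL;
    m->size = size;
    m->owns_binary = args->clone_wasm_binary;
    if (args->clone_wasm_binary) {
        /* Clone: a second copy of the binary lives until delete. */
        unsigned char *copy = malloc(size ? size : 1);
        if (!copy) {
            free(m);
            return NULL;
        }
        if (size)
            memcpy(copy, binary, size);
        m->binary = copy;
    }
    else {
        /* Borrow: no extra memory, but the caller owns the lifetime. */
        m->binary = binary;
    }
    return m;
}

/* Extra bytes held by the module on top of the caller's buffer. */
size_t
toy_module_extra_bytes(const toy_cmodule_t *m)
{
    return m->owns_binary ? m->size : 0;
}

void
toy_module_delete(toy_cmodule_t *m)
{
    if (!m)
        return;
    if (m->owns_binary)
        free((void *)m->binary);
    free(m);
}
```

In the cloning case the binary is held twice until the module is deleted; in the borrowing case the module adds nothing, and the caller decides when the single buffer goes away, subject to the constraints discussed in the review comments.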