New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve compilation and linking of Rust files #11102
Conversation
I will be away next week, but feel free to comment anyway |
CI state |
CI state |
0be5992
to
b07eb82
Compare
@@ -1406,7 +1402,7 @@ def clang_inline_threshold(): | |||
for mode in modes: | |||
modes[mode]['cxxflags'] += f' -Wstack-usage={modes[mode]["stack-usage-threshold"]} -Wno-error=stack-usage=' | |||
|
|||
has_wasmtime = os.path.isfile('/usr/lib64/libwasmtime.a') and os.path.isdir('/usr/local/include/wasmtime') | |||
has_wasmtime = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because linking with both libwasmtime
and librust_combined
was causing duplicate symbol errors. This will not be an issue anymore when we add wasmtime in rust, and that patch is only waiting for #10306, so this temporary fix should not last a long time (wasm is also still experimental)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this "fix" means that wasm will no longer be working at all, until a later fix? @psarna are we fine with that?
I see 10306 has already been merged, so can't you fix that wasm issue right now and not have any temporary period where it doesn't work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed that one. No, it's not acceptable, we have experimental support, but we also already published a few examples of it online and we expect people to try it out, not to find out they're missing in a particular commit. What should be done here is not linking with librust_combined now if it's useless and doesn't provide anything. Or perhaps it's enough to mark it no-std? ref: https://docs.rust-embedded.org/book/intro/no-std.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After trying a couple of methods, it seems that to avoid the linking issues we would have to at least modify our dependencies (in combination with using no-std).
We can't avoid linking with librust_combined as long as we have the test/boost/rust_test
.
We can't merge the mentioned rust wasmtime patch before this one, because similarly, we'll have linking issues.
Having that in mind, I think it would be best if we combined these 2 patches in one PR. I'll soon publish the new PR depending on this one anyway, but recently we decided that there's something I still need to add to it, so you'll have to wait a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also fine with temporarily dropping rust_test if it helps, we can bring it back once we have wasmtime, or just drop it in favor of some custom wasmtime tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you commented this before yesterday's patch. Would you still like rust_test dropped or will it be fine if we just combine this series with the wasmtime one and try to merge them at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if dropping the test means that we can cheaply avoid disabling wasm-based udfs for a while (even within the same series), then I think it's worth it. But if it's in any way not trivial, the current state is acceptable too, because wasm support is experimental anyway.
I'm back from vacations so please review when you have time |
.gitignore
Outdated
@@ -31,3 +31,4 @@ docs/poetry.lock | |||
compile_commands.json | |||
.ccls-cache/ | |||
.mypy_cache | |||
rust/Cargo.lock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should check in Cargo.lock into the Scylla repository. It specifies exact versions of dependencies, so checking it in will help achieve more reproducible builds.
See: https://doc.rust-lang.org/cargo/faq.html#why-do-binaries-have-cargolock-in-version-control-but-not-libraries . In our case, the resulting static library is a binary in some sense, and it is "at the very end of the dependency chain", so I believe the rule for binaries applies here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did base my decision to put the Cargo.lock here on the same article, but I see that your interpretation makes more sense.
Tracking the Cargo.lock will however require an additional step of regenerating it after any dependency changes, so I'll mention it in the docs
rust/Cargo.toml
Outdated
[profile.rust-debug] | ||
inherits = "dev" | ||
opt-level = 1 | ||
overflow-checks = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't overflow checks be enabled in debug?
rust/Cargo.toml
Outdated
|
||
[profile.rust-sanitize] | ||
inherits = "dev" | ||
opt-level = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sanitize
mode compiles cpp files using -Os
, so maybe 's' would be more appropriate here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect that both clang and rustc use similar optimization passes for the s
optimization level considering they both use llvm, but I didn't find any direct comparison that confirms it. There is a slight difference in how they are described in their official documentations though: clang describes -Os
as "Like -O2
with extra optimizations to reduce code size", while rust describes it as just: "optimize for binary size".
The description of the Sanitize
mode provided by scylla is "a mode with optimizations and sanitizers enabled, used for finding memory errors", so I used the mode that is confirmed to at least have some optimizations.
On the other hand, I haven't confirmed that the -O2
mode in rust is that similar to clang's -O2
, so for the sake of consistency, maybe the s
option would be best after all
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the main purpose of -Os
is to "optimize for size", so IMO we should use s
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
CI state |
57bc512
to
dc80b61
Compare
CI state |
I don't understand any of this. We have a build running in install-dependencies.sh, why do we need to run it again? |
Maybe the naming here is misleading. This patch affects the compilation of source files in the |
I noticed cxxbridge allows to remove some code duplication using |
CI state |
Please try to clarify the wording then so I can understand it. |
@@ -1406,7 +1402,7 @@ def clang_inline_threshold(): | |||
for mode in modes: | |||
modes[mode]['cxxflags'] += f' -Wstack-usage={modes[mode]["stack-usage-threshold"]} -Wno-error=stack-usage=' | |||
|
|||
has_wasmtime = os.path.isfile('/usr/lib64/libwasmtime.a') and os.path.isdir('/usr/local/include/wasmtime') | |||
has_wasmtime = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this "fix" means that wasm will no longer be working at all, until a later fix? @psarna are we fine with that?
I see 10306 has already been merged, so can't you fix that wasm issue right now and not have any temporary period where it doesn't work?
|
||
[profile.rust-sanitize] | ||
inherits = "dev" | ||
opt-level = "s" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "opt-level=s" do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does "sanitize" mode specifically needs to optimize for size? I'm not saying it's bad, it just sounds strange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cpp files are compiled with the -Os
flag in the sanitize mode, so I suggested using an equivalent flag for rust files.
As for why we are optimizing for size in the cpp files - I'm not sure. The original patchset which introduces sanitize mode (https://groups.google.com/g/scylladb-dev/c/wQ_pw5zkJIg/m/bHW-oeO3AwAJ) doesn't seem to explain it, and the author doesn't work with us anymore. If you have a good argument not to optimize for size here, then I'm completely fine with using a different mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm fine with this optimization mode. Any other choice will seem equality arbitrary...
@@ -1406,7 +1402,7 @@ def clang_inline_threshold(): | |||
for mode in modes: | |||
modes[mode]['cxxflags'] += f' -Wstack-usage={modes[mode]["stack-usage-threshold"]} -Wno-error=stack-usage=' | |||
|
|||
has_wasmtime = os.path.isfile('/usr/lib64/libwasmtime.a') and os.path.isdir('/usr/local/include/wasmtime') | |||
has_wasmtime = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed that one. No, it's not acceptable, we have experimental support, but we also already published a few examples of it online and we expect people to try it out, not to find out they're missing in a particular commit. What should be done here is not linking with librust_combined now if it's useless and doesn't provide anything. Or perhaps it's enough to mark it no-std? ref: https://docs.rust-embedded.org/book/intro/no-std.html
CI state |
CI state |
In the last rebases I modified the configure script: because we're creating one library for all rust sources together, I added a list of all rust files that are required for the lib. The targets that require rust sources should now include the related |
This patch is now also a part of #11351 so if it makes more sense to comment with the context of further changes, it can be done there |
CI state |
Looks reasonable, will defer to @psarna decision. |
Currently, the rust build system in Scylla creates a separate static library for each incuded rust package. This could cause duplicate symbol issues when linking against multiple libraries compiled from rust. This issue is fixed in this pach by creating a single static library to link against, which combines all rust packages implemented in Scylla. The Cargo.lock for the combined build is now tracked, so that all users of the same scylla version also use the same versions of imported rust modules. Additionally, the rust package implementation and usage docs are modified to be compatible with the build changes. This patch also adds a new header file 'rust/cxx.hh' that contains definitions of additional rust types available in c++.
A cargo profile is created for each of build modes: dev, debug, sanitize, realease and coverage. The names of cargo profiles are prefixed by "rust-" because cargo does not allow separate "dev" and "debug" profiles. The main difference between profiles are their optimization levels, they correlate to the levels used in configure.py. The debug info is stripped only in the dev mode, and only this mode uses "incremental" compilation to speed it up.
CI state |
This series adds the implementation and usage of rust wasmtime bindings. The WASM UDFs introduced by this patch are interruptable and use memory allocated using the seastar allocator. This series includes #11102 (the first two commits) because #11102 required disabling wasm UDFs completely. This patch disables them in the middle of the series, and enables them again at the end. After this patch, `libwasmtime.a` can be removed from the toolchain. This patch also removes the workaround for ##9387 but it hasn't been tested with ARM yet - if the ARM test causes issues I'll revert this part of the change. Closes #11351 * github.com:scylladb/scylladb: build: remove references to unused c bindings of wasmtime test: assert that WASM allocations can fail without crashing wasm: limit memory allocated using mmap wasm: add configuration options for instance cache and udf execution test: check that wasmtime functions yield wasm: use the new rust bindings of wasmtime rust: add Wasmtime bindings rust: add build profiles more aligned with ninja modes rust: adjust build according to cxxbridge's recommendations tools: toolchain: dbuild: prepare for sharing cargo cache
This PR was added as a part of #11351, which just got merged. We can close this now as well |
This patch can be treated as a follow-up to #10341, or as a preparation before migrating
from the lacking C++ wasmtime bindings to our custom implementation that will use the
original Rust crate.
The main changes are the introduction of cargo profiles (equivalents of scylla build modes), fixes related to linking against multiple rust packages and corresponding adjustments in documentation.
Some details are included in the commit messages, but you can read more about using
the Rust "CXX" bridge in a foreign build system here and the available cargo profile options here