-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce the wasmtime-fuzzing
crate
#619
Introduce the wasmtime-fuzzing
crate
#619
Conversation
This crate is intended to hold all of our various test case generators and oracles. The fuzz targets we have at `wasmtime/fuzz/fuzz_targets/*` will eventually be ~one-liner glue code calling into this crate. Part of bytecodealliance#611
60b39ac
to
7faec9b
Compare
Hmmmm... I'm getting this error in CI, but not locally:
|
We are using 2018 edition.
7faec9b
to
a26103b
Compare
It looks like binaryen-sys is failing to build on Windows. For the nightly failure did you try The release builds are also failing on Linux/Windows due to a binaryen build failure I believe. |
Oh and r=me on the code changes here :) |
Yeah, I tried |
This contains an update of the `bindgen` build dependency, to a new version that compiles on nightly, unlike the old version.
8691caa
to
4df8613
Compare
U: Unstructured + ?Sized, | ||
{ | ||
let seed: Vec<u8> = Arbitrary::arbitrary(input)?; | ||
let module = binaryen::tools::translate_to_fuzz_mvp(&seed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'm now realizing and have wondered for awhile, I'm not actually sure if it's useful to hook up wasm-opt
in this way to libfuzzer, maybe? Usage of libFuzzer
I thought attempt to track variations in the input with variations in control flow, but in this case if we fork out to external C++ or a different process there's no way for libFuzzer
to track how the input affects the output.
If that's the case we're basically just using a bunch of libfuzzer infrastructure that's largely overkill since it's just a glorified RNG.
Even though if we used libfuzzer to produce wasm binaries directly I've wondered if it's all that useful. Most wasm binary encodings have things like section lengths and/or length headers, so it seems like it'd be very difficult for a fuzzer to intelligently produce randomly valid wasm modules.
I'm not really sure there's any way to fix this per se, other than using the input byte-by-byte and writing our own -ttf
pass in Rust itself, so libfuzzer can see how the input is affecting the branches of the wasm generator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so libfuzzer can see how the input is affecting the branches of the wasm generator.
I don't think that the important code to cover is the generator's code, its the code covered in the system under test that is important. Yes, there is probably a correlation between new paths in the generator and new paths in the system under test, but I think we are fine here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er well I guess my point is that I think libfuzzer is overkill and not needed here. I don't think it's actually buying us anything over "do this in a loop 100 times".
My experience with libfuzzer is that it's nigh impossible to debug a failed test case on CI because you don't actually have what's necessary to reproduce it locally. If we wrote our own "just run this in a loop", though, then that would be much easier to print out failing test cases on CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er well I guess my point is that I think libfuzzer is overkill and not needed here. I don't think it's actually buying us anything over "do this in a loop 100 times".
I've personally found that it can do significantly better than a purely random set of inputs.
My experience with libfuzzer is that it's nigh impossible to debug a failed test case on CI because you don't actually have what's necessary to reproduce it locally.
This is a totally valid concern and dev experience is super important here. libfuzzer does require writing some infrastructure yourself to get the failing test cases in a nice way, however that is true if you were to write the "do this in a loop 100 times" code too (especially in the presence of unexpected panics/segfaults compared to the property test style where some predicate returns false), and this PR is the start of filling out the code (in a way that is largely independent of libFuzzer too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say that libfuzzer can perform better, I would believe that, but I don't believe it's true in this specific case. True randomization is pretty unhelpful with fuzzing, but using wasm-opt
I think is basically true randomization which isn't really all that helpful. I'm led to believe that the usefulness of libfuzzer is that the engine can track changes in the input to changes in the execution code path of the code itself. When using wasm-opt -ttf
, it's impossible for the engine to make this connection, so it's a glorified RNG.
Or do you meant that for this PR specifically you've seen improvements with libfuzzer vs a loop?
I agree that panics/segfaults are hard to capture, but libfuzzer in my experience, especially when combined with wasm-opt
, is basically useless. I've never been able to take a libfuzzer-generated wasm file that wasm-opt
generate and somehow get back the wasm causing the issue.
I don't want to really put the brakes on this PR though. I mostly want to point out that I really believe that libfuzzer is buying us nothing if we use wasm-opt -ttf
, but libfuzzer can of course buy us quite a lot for other fuzzing strategies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've written about combining structure-aware generators with coverage-guided fuzzers before, and I think I've articulated it better there than I am doing here. It isn't a new thing, for example JQF combines quickcheck-style property testing with coverage-guided fuzzing, similar to what is being done here.
True randomization is pretty unhelpful with fuzzing, but using
wasm-opt
I think is basically true randomization which isn't really all that helpful.
When you say "true randomization" do you mean in contrast to coverage-guided fuzzing?
Sourcing wasm-opt -ttf
's input from libFuzzer allows libFuzzer to effectively "drive" the generator. This is because the input given to wasm-opt -ttf
isn't a seed for an RNG (in which case this would be "true randomization") it is a stream of choices that guide a path through a decision tree. Does that make sense?
I agree that panics/segfaults are hard to capture, but libfuzzer in my experience, especially when combined with
wasm-opt
, is basically useless. I've never been able to take a libfuzzer-generated wasm file thatwasm-opt
generate and somehow get back the wasm causing the issue.
With Walrus, I've had good experiences getting reproducible test cases out. As mentioned upthread, it did require writing some infrastructure, but that is true regardless if we use libFuzzer, or AFL, or our own custom thing, or...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say "true randomization" do you mean in contrast to coverage-guided fuzzing?
This is similar to...
This is because the input given to wasm-opt -ttf isn't a seed for an RNG (in which case this would be "true randomization")
I originally thought wasm-opt
literally did seed an RNG with the input since the input definitely isn't considered as a wasm module. You can run wasm-opt -ttf
with any input, but it turns out the input stream is just a stream of bytes which pops out in calls and is presumably interpreted as "what should I do next".
So this isn't quite what I thought was a true RNG, but rather much closer to what I thought fuzzers do in terms of very small perturbations of the input in theory have very small impacts in how they tickle the input engine.
I think it's still the case that libfuzzer generally tries to connect perturbations in the input to changes in the execution, which it can't do here since the input has no connection to the output libfuzzer can see (through an external process). That being said the subtle differences probably means that it's "good enough" and I was largely just mistaken at how wasm-opt
was implemented.
I do think though it'd be good to add an example here to print out a failing test case, since IIRC libfuzzer doesn't print out like the stdout/stderr of the program, and if this fails on CI we'd ideally want to see the *.wat
somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, cool, it seems like we are starting to come to consensus.
I do think though it'd be good to add an example here to print out a failing test case, since IIRC libfuzzer doesn't print out like the stdout/stderr of the program, and if this fails on CI we'd ideally want to see the
*.wat
somehow.
Yeah totally, this is high on my TODO list, and is going to happen in follow ups very soon.
This PR specifically is just a refactoring of our existing fuzzing set up, to separate oracles and generators, and it doesn't add/remove/change any behavior or features.
4b95e80
to
eab0ff4
Compare
eab0ff4
to
2274135
Compare
CI appears to be hitting #633 |
Everything is green except that unrelated |
…ytecodealliance#619) * Clarify Cranelift's design with respect to mid-level optimization. Cranelift doesn't currently do much mid-level optimization, however it is something we're thinking about, so remove text describing it as out of scope, and add more text explaining the vision for how it would fit into the overall system.
This crate provides test case generators and oracles for use with fuzzing.
These generators and oracles are generally independent of the fuzzing engine that might be using them and driving the whole fuzzing process (e.g. libFuzzer or AFL). As such, this crate does not contain any actual fuzz targets itself. Those are generally just a couple lines of glue code that plug raw input from (for example)
libFuzzer
into a generator, and then run one or more oracles on the generated test case.Right now, this is just a refactoring of our existing fuzz targets, and separating generators from oracles.
Part of #611