Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Construct and use a 'fuzzing dictionary' #8

Open
Zac-HD opened this issue Oct 26, 2022 · 0 comments
Open

Construct and use a 'fuzzing dictionary' #8

Zac-HD opened this issue Oct 26, 2022 · 0 comments

Comments

@Zac-HD
Copy link
Owner

Zac-HD commented Oct 26, 2022

In fuzzing, a "dictionary" is a corpus of known-interesting fragments (boundary values, html tags, etc.) that can be mixed in with randomly-generated or mutated data to increase the chance of stumbling across interesting bugs.

We kinda support doing this with Hypothesis for some types already; it's how we boost the chances of boundary integers and "interesting" floats. However there's not currently any mechanism for adding to the pool at runtime, and adding one will take some care to ensure that we can still replay failing examples without that runtime pool. See also HypothesisWorks/hypothesis#3086 and HypothesisWorks/hypothesis#3127 (comment).

Once we've got that, the standard easy way to get a dictionary is to run strings on your binary. The natural equivalent is to grab our Python source code and collect all the ast.Constant values! (excluding perhaps long strings, which are likely docstrings)

A more advanced trick, shading into full research project, would be to investigate Redqueen-style tracking. For example, "a string in the input matched against this regex pattern in the code, so try generating strings matching that pattern".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant