Add a Hypothesis plugin #2097

Merged: 4 commits, on Feb 11, 2021

Contributor

@Zac-HD Zac-HD commented Nov 7, 2020

This patch adds a plugin which teaches Hypothesis how to generate examples of Pydantic's custom field types (which closes #2017). Note that there is no runtime impact or dependency; this module is imported by Hypothesis, so it only exists at test-time and only when Hypothesis is both installed and actually being used.

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
    (see changes/README.md for details)

I've ended up dropping the integration for URLs, and for constrained lists and sets... but everything else is ready to go!

Why not support constrained lists and sets?

In short, I couldn't get them to play nicely with Hypothesis' type registry - because the runtime objects are subclasses of a parametrized generic type, and aren't always distinguished by our introspection logic 😭

Why I decided to leave out URLs (with code)

I spent a long time tweaking support for URLs, but ultimately gave it up: it's easy to generate valid URLs, unless you want to generate really strange ones which will find lots of bugs. I decided that it was better to make users register their own - and make the tradeoff explicit - than to cover up the complexity and have automatic but weak tests. This is a standard design choice for Hypothesis, sadly, so I've included my code so far below for the benefit of any future contributor who wants to pick it up.

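import hypothesis
import pydantic
from hypothesis import strategies as st
from pydantic.networks import ascii_domain_regex, int_domain_regex

# `resolves` is assumed to be the registration decorator defined in
# pydantic/_hypothesis_plugin.py; it is not shown in this snippet.


def idna_encodable(s: str) -> bool: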
    # We only need this because the regex patterns aren't fully precise; but
    # rejection sampling is a LOT easier to implement than precise patterns.
    try:
        s.encode('idna')
    except Exception:
        hypothesis.reject()  # type: ignore[no-untyped-call]
    return True


@resolves(pydantic.AnyUrl)
def resolve_anyurl(cls):  # type: ignore[no-untyped-def]
    domains = st.one_of(
        st.from_regex(ascii_domain_regex(), fullmatch=True),
        st.from_regex(int_domain_regex(), fullmatch=True).filter(idna_encodable),
    )
    if cls.tld_required:

        def has_tld(s: str) -> bool:
            assert isinstance(s, str)
            match = ascii_domain_regex().fullmatch(s) or int_domain_regex().fullmatch(s)
            return bool(match and match.group('tld'))

        hosts = domains.filter(has_tld)
    else:
        hosts = domains | st.from_regex(
            r'(?P<ipv4>(?:\d{1,3}\.){3}\d{1,3})' r'|(?P<ipv6>\[[A-F0-9]*:[A-F0-9:]+\])',
            fullmatch=True,
        )

    return st.builds(
        cls.build,
        scheme=(
            st.sampled_from(sorted(cls.allowed_schemes))
            if cls.allowed_schemes
            else st.from_regex(r'(?P<scheme>[a-z][a-z0-9+\-.]+)', fullmatch=True)
        ).filter(idna_encodable),
        user=st.one_of(
            st.nothing() if cls.user_required else st.none(),
            st.from_regex(r'(?P<user>[^\s:/]+)', fullmatch=True).filter(idna_encodable),
        ),
        password=st.none() | st.from_regex(r'(?P<password>[^\s/]*)', fullmatch=True).filter(idna_encodable),
        host=hosts,
        port=st.none() | st.integers(0, 2 ** 16 - 1).map(str),
        path=st.none() | st.from_regex(r'(?P<path>/[^\s?]*)', fullmatch=True).filter(idna_encodable),
        query=st.none() | st.from_regex(r'(?P<query>[^\s#]+)', fullmatch=True).filter(idna_encodable),
        fragment=st.none() | st.from_regex(r'(?P<fragment>\S+)', fullmatch=True).filter(idna_encodable),
    ).filter(lambda url: cls.min_length <= len(url) <= cls.max_length)


st.register_type_strategy(pydantic.AnyUrl, resolve_anyurl)
st.register_type_strategy(pydantic.AnyHttpUrl, resolve_anyurl)
st.register_type_strategy(pydantic.HttpUrl, resolve_anyurl)
st.register_type_strategy(pydantic.PostgresDsn, resolve_anyurl)
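st.register_type_strategy(pydantic.RedisDsn, resolve_anyurl)


# Test code for the strategies above (a separate snippet, with its own imports)
import pydantic
import pytest
from hypothesis import HealthCheck, given, settings, strategies as st


def gen_url_models():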
    class AnyUrlModel(pydantic.BaseModel):
        anyurl: pydantic.AnyUrl

    class AnyHttpUrlModel(pydantic.BaseModel):
        anyhttp: pydantic.AnyHttpUrl

    class HttpUrlModel(pydantic.BaseModel):
        http: pydantic.HttpUrl

    class PostgresDsnModel(pydantic.BaseModel):
        postgres: pydantic.PostgresDsn

    class RedisDsnModel(pydantic.BaseModel):
        redis: pydantic.RedisDsn

    yield from (AnyUrlModel, AnyHttpUrlModel, HttpUrlModel, PostgresDsnModel, RedisDsnModel)


@pytest.mark.parametrize('model', gen_url_models())
@settings(suppress_health_check=[HealthCheck.filter_too_much, HealthCheck.too_slow])
@given(data=st.data())
def test_can_construct_urls_model(data, model):
    # This is a separate test because we want a minimal health-check exemption
    instance = data.draw(st.from_type(model))
    assert isinstance(instance, model)
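The rejection-sampling helper above relies on Python's built-in `idna` codec. As a minimal standalone sketch (not part of the PR; `is_idna_encodable` is a hypothetical name), the same check can be written as a plain predicate that returns `False` instead of calling `hypothesis.reject()`, which makes it easy to try out without Hypothesis installed:

```python
def is_idna_encodable(s: str) -> bool:
    """Return True if Python's built-in 'idna' codec can encode `s`."""
    try:
        s.encode("idna")
    except UnicodeError:
        # Raised for empty labels, labels of 64+ characters, and other
        # inputs the codec cannot represent.
        return False
    return True


# A non-ASCII label is fine; an over-long or empty label is not.
print(is_idna_encodable("bücher.example"))
print(is_idna_encodable("a" * 64 + ".example"))
print(is_idna_encodable("a..example"))
```

A predicate like this could be passed to `.filter()` in the same positions as `idna_encodable`; the trade-off is that filtered-out examples count against Hypothesis' `filter_too_much` health check, which is presumably why the test above suppresses it.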

`Literal[None]` requires Python 3.7+ or a recent version of Hypothesis

Because I literally just fixed that this evening - since it requires a backport on a security-only Python version, we delayed working out how to support it until we knew it would actually be used before the 3.6 EOL later this year.

@Zac-HD Zac-HD force-pushed the hypothesis-plugin branch 4 times, most recently from 55517f8 to e7a11a5 Compare November 7, 2020 08:26
@codecov
codecov bot commented Nov 7, 2020

Codecov Report

Merging #2097 (6cece5c) into master (d0baf0f) will decrease coverage by 0.11%.
The diff coverage is 96.03%.

@@             Coverage Diff             @@
##            master    #2097      +/-   ##
===========================================
- Coverage   100.00%   99.88%   -0.12%     
===========================================
  Files           21       22       +1     
  Lines         4202     4323     +121     
  Branches       855      873      +18     
===========================================
+ Hits          4202     4318     +116     
- Misses           0        5       +5     
Impacted Files                    Coverage Δ
pydantic/_hypothesis_plugin.py    95.68% <95.68%> (ø)
pydantic/types.py                 100.00% <100.00%> (ø)


@Zac-HD Zac-HD force-pushed the hypothesis-plugin branch 7 times, most recently from c54f1ba to 8232803 Compare November 8, 2020 07:54
@PrettyWood
Member

Amazing work @Zac-HD! So glad to see this PR open!
Could you please check the coverage to make sure it remains at 100%?

@Zac-HD
Contributor Author

Zac-HD commented Nov 27, 2020

@PrettyWood - thanks! I'm looking forward to seeing what people do with it 😃

There was one deliberately unreachable line (assertion that we returned from a loop body), which I've unrolled so we get 100% coverage without needing lots of pragmas.
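To illustrate the kind of change described here, the following is a hypothetical sketch, not the actual code from this PR: a loop whose cases are exhaustive needs a trailing "unreachable" line that coverage can never hit, while the unrolled version makes the last case the fall-through. The function names and the predicates are made up for the example.

```python
def describe_looped(n: int) -> str:
    # Loop form: one of the predicates always matches, so the final line can
    # never execute and shows up as a missed line in the coverage report.
    cases = (
        ("negative", lambda x: x < 0),
        ("non-negative", lambda x: x >= 0),
    )
    for name, matches in cases:
        if matches(n):
            return name
    raise AssertionError("unreachable")  # never executed, never covered


def describe_unrolled(n: int) -> str:
    # Unrolled form: the last case is the fall-through, so every line is
    # reachable and no `# pragma: no cover` is needed.
    if n < 0:
        return "negative"
    assert n >= 0
    return "non-negative"
```

Both functions return the same results; only the unrolled one can reach 100% line coverage without a pragma.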

Member

@samuelcolvin samuelcolvin left a comment

Just a few small things, overall I think this is looking great.

docs/examples/hypothesis_property_based_test.py (outdated, resolved)
docs/hypothesis_plugin.md (outdated, resolved)
docs/hypothesis_plugin.md (outdated, resolved)
docs/hypothesis_plugin.md (outdated, resolved)
pydantic/_hypothesis_plugin.py (outdated, resolved)
pydantic/_hypothesis_plugin.py (resolved)
pydantic/_hypothesis_plugin.py (outdated, resolved)
pydantic/color.py (outdated, resolved)
setup.py (outdated, resolved)
tests/test_hypothesis_plugin.py (outdated, resolved)
@Zac-HD Zac-HD mentioned this pull request Nov 30, 2020
4 tasks
@Zac-HD Zac-HD force-pushed the hypothesis-plugin branch 7 times, most recently from b835b9d to 1b10c1e Compare December 1, 2020 11:18
@Zac-HD
Contributor Author

Zac-HD commented Dec 1, 2020

OK @samuelcolvin - I think I'm done, and have reverted the #2155 patch in favor of a test-only solution 😄

@lsorber

lsorber commented Dec 20, 2020

@Zac-HD I was wondering, would it be possible to add generic support for constrained types like ConstrainedStr? For example:

st.register_type_strategy(
    ConstrainedStr,
    lambda T: st.from_regex(T.regex, fullmatch=True) if T.regex is not None else st.text(min_size=T.min_length, max_size=T.max_length)
)

Unfortunately, that type strategy is not picked up for subclasses of ConstrainedStr:

class MyString(ConstrainedStr):
    regex = "[A-Z]{2,8}"

st.from_type(MyString).example()  # Returns empty string

This doesn't need to hold back this PR, but it is related so I thought I'd post it here.

@Zac-HD
Contributor Author

Zac-HD commented Dec 20, 2020

I was all ready to explain why we couldn't, but in fact there is a way... it's just a little more complicated than you might expect. First, the obvious approach of registering ConstrainedStr doesn't do anything for child classes. That's correct, if unfortunate in this case - otherwise we could just return builds(object) for everything!
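The `builds(object)` remark can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not Hypothesis' real lookup logic: `REGISTRY`, `lookup_exact`, and `lookup_inherited` are hypothetical names. It compares an exact-type lookup with one that walks the MRO, and shows how the inherited variant lets a registration for `object` capture every type.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a type registry; not Hypothesis' data structure.
REGISTRY: Dict[type, Callable[[], object]] = {}


def lookup_exact(cls: type) -> Optional[Callable[[], object]]:
    # Only the registered class itself matches; subclasses get nothing.
    return REGISTRY.get(cls)


def lookup_inherited(cls: type) -> Optional[Callable[[], object]]:
    # Walk the MRO and take the first registered base class.
    for base in cls.__mro__:
        if base in REGISTRY:
            return REGISTRY[base]
    return None


class Parent(str):
    pass


class Child(Parent):
    pass


REGISTRY[Parent] = lambda: Parent("from-parent")
REGISTRY[object] = lambda: object()

print(lookup_exact(Child))           # no match for the subclass
print(lookup_inherited(Child)())     # falls back to Parent's factory
print(type(lookup_inherited(int)()))  # `object` wins, and it is not an int
```

In this toy model the inherited lookup would help `Child`, but it also hands out a bare `object()` when asked for an `int`, which is the failure mode the comment alludes to.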

So the trick is to register a strategy for each child class when that class is created, which can be done from the __init__ method of a metaclass. The second trick is to ensure that this is a noop unless the user is already using Hypothesis, and that can be done with a WeakSet and a plugin (see https://github.com/Parquery/icontract/pull/181/files plus https://github.com/mristin/icontract-hypothesis/pull/5/files for an example).
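A rough, standard-library-only sketch of that two-part trick follows. Every name in it (`ConstrainedMeta`, `ConstrainedBase`, `activate_plugin`, `_SEEN_CLASSES`) is hypothetical, and it is neither pydantic's nor icontract's actual code. The metaclass only records classes in a `WeakSet`, which is cheap and imports nothing; a test-time plugin later supplies the callback that does the real registration. In a real plugin that callback would presumably call `st.register_type_strategy` for each class.

```python
import weakref
from typing import Callable, Dict, Optional

# Classes created so far; weak references mean this never keeps a class alive.
_SEEN_CLASSES = weakref.WeakSet()
# Set by the (hypothetical) plugin; stays None if the plugin never loads.
_register_hook: Optional[Callable[[type], None]] = None


class ConstrainedMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        _SEEN_CLASSES.add(cls)  # a no-op as far as any testing library goes
        if _register_hook is not None:
            _register_hook(cls)  # plugin already active: register right away


class ConstrainedBase(metaclass=ConstrainedMeta):
    pattern: Optional[str] = None


def activate_plugin(register: Callable[[type], None]) -> None:
    """Called once by the plugin: handle old classes, then hook new ones."""
    global _register_hook
    _register_hook = register
    for cls in list(_SEEN_CLASSES):
        register(cls)


class Early(ConstrainedBase):  # defined before the plugin is activated
    pattern = "[A-Z]{2,8}"


registered: Dict[type, Optional[str]] = {}
activate_plugin(lambda cls: registered.__setitem__(cls, cls.pattern))


class Late(ConstrainedBase):  # defined after the plugin is activated
    pattern = "[0-9]+"
```

After this runs, `registered` holds entries for `ConstrainedBase`, `Early`, and `Late`, so classes are covered regardless of whether they were defined before or after activation.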

I'll update this PR in a week or two, or if you'd like to merge it sooner I can open a follow-up instead 😁

@Zac-HD Zac-HD force-pushed the hypothesis-plugin branch 2 times, most recently from 0d5f569 to eb7aaca Compare December 30, 2020 13:56
@Zac-HD Zac-HD force-pushed the hypothesis-plugin branch 2 times, most recently from 81efff5 to 4e193cf Compare January 3, 2021 11:05
@PrettyWood
Member

Great job again @Zac-HD! I'll probably use it as soon as v1.8 is released 🚀
For conset and conlist, I think it will be possible to support them in the plugin once they are rewritten with Generic[T]. We are currently waiting for Cython to support it.

Makefile (outdated, resolved)
@Zac-HD
Contributor Author

Zac-HD commented Jan 4, 2021

Thanks for the review @PrettyWood! I've added your suggestions, along with even more comments to explain what's happening 😄

@samuelcolvin samuelcolvin merged commit 771b0d3 into pydantic:master Feb 11, 2021
@samuelcolvin
Member

this is awesome, thank you so much! 🚀 🙏 🥳

I'm working through PRs now, v1.8 coming soon.

@Zac-HD
Contributor Author

Zac-HD commented Feb 11, 2021

Woohoo! Pydantic is also awesome, I'm so excited about combining it with Hypothesis 😁

Can't wait to see what people do with this either :shipit:

Successfully merging this pull request may close these issues.

Generating Pydantic-specific types with Hypothesis
4 participants