Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Config.smart_union option #2092

Merged
merged 14 commits into from Dec 7, 2021

Conversation

PrettyWood
Copy link
Member

@PrettyWood PrettyWood commented Nov 4, 2020

Change Summary

Following this comment, here is a proposal for a new option smart_union that changes current behaviour of Union to avoid coercion if possible

from typing import Dict, List, Union

from pydantic import BaseModel


class Model(BaseModel, smart_union=True):
    v: Union[int, bool, str]

assert Model(v=1).dict() == {'v': 1}
assert Model(v=True).dict() == {'v': True}
assert Model(v='1').dict() == {'v': '1'}

@samuelcolvin I'll update the doc if you're fine with it

Related issue number

A bunch of closed ones 😄
closes #453
closes #1423
closes #3196
closes #3365

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
    (see changes/README.md for details)

@codecov
Copy link

codecov bot commented Nov 4, 2020

Codecov Report

Merging #2092 (76ae8df) into master (7b7e705) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 76ae8df differs from pull request most recent head 060a762. Consider uploading reports for the commit 060a762 to get more accurate results

@@            Coverage Diff             @@
##            master     #2092    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           25        21     -4     
  Lines         5109      4149   -960     
  Branches      1050       834   -216     
==========================================
- Hits          5109      4149   -960     
Impacted Files Coverage Δ
pydantic/errors.py 100.00% <100.00%> (ø)
pydantic/mypy.py 100.00% <100.00%> (ø)
pydantic/types.py 100.00% <100.00%> (ø)
pydantic/json.py 100.00% <0.00%> (ø)
pydantic/utils.py 100.00% <0.00%> (ø)
pydantic/fields.py 100.00% <0.00%> (ø)
pydantic/schema.py 100.00% <0.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 415eb54...060a762. Read the comment docs.

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Humm, this looks good.

I just wonder if we should go the whole hog and implement SmartUnion which does this then tries sensible next steps.

Having:

  • Union
  • StrictUnion
  • LaxUnion
  • SmartUnion
  • DecriminatedUnion

Isn't very zen of python

There should be one-- and preferably only one --obvious way to do it.

I know we can't avoid having multiple, but I think we should aim at one type of union which is useful in as many cases as possible -- the behaviour we'll adopt for Union if v2.

For me that would be SmartUnion which does this and then falls back to the current union behaviour or something smarter.

See the discussion on #1423, e.g. #1423 (comment)

pydantic/mypy.py Outdated Show resolved Hide resolved
pydantic/types.py Outdated Show resolved Hide resolved
tests/mypy/outputs/plugin_success.txt Outdated Show resolved Hide resolved
@marco-neumann-by
Copy link

While trying out StrictUnion, I stumbled across the issue that PikaModel.schema() crashes because pydantic doesn't know what to do with the newly created StrictUnion type. Could you please add a test for that? This should probably just create the very same schema as a "normal" Union.

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise looks good, but what did you think about my suggestion above of SmartUnion?

tests/mypy/modules/plugin_success.py Outdated Show resolved Hide resolved
@PrettyWood
Copy link
Member Author

otherwise looks good, but what did you think about my suggestion above of SmartUnion?

I agree but I want to take more time to do it properly. And I also need to support schema like a regular union

@samuelcolvin
Copy link
Member

Okay, so do you think we should merge this or wait for those improvements?

@PrettyWood
Copy link
Member Author

I reckon we should wait. I'll try to work on it soon but I need to put everything in perspective and add some code + tests for schema, mypy...

@samuelcolvin
Copy link
Member

👍

@PrettyWood PrettyWood changed the title feat: add StrictUnion type [WIP] feat: add StrictUnion type Jan 17, 2021
@PrettyWood
Copy link
Member Author

PrettyWood commented Jan 20, 2021

@samuelcolvin Some news regarding this issue!! 🎉

I took some time today to give it another go after the opened issue on this matter at noon.
I'm not really happy with the current approach with a custom StrictUnion class. Since we can't inherit from Union or make a Generic with many types, I had to go with the metaclass approach but it forces me to support the schema generation, the mypy plugin and it creates a lot of false negative in the IDE.

I reckon the best approach is to keep the Union as is and use a config flag to make it "smart" (flag that will probably be switched to True by default or even removed in v2).
I really don't think people will mix Strict and not Strict unions inside the same model like I did in my tests.
The SmartUnion should be enough for all cases IMO.

We want to tackle simple cases like Union[str, int] but also more complex Union like Union[Dict[str, str], List[str]],
for which pydantic will currently coerce ["ab"] into {"a": "b"} by calling dict.
Lucky coincidence, I just created a package typingx a week ago that was first meant for unit tests to check data shape. The main function is isinstancex, an isinstance-like method that also works with most typing types.

Here is a POC that works well with it: https://github.com/PrettyWood/pydantic/pull/66/files.
As you can see it tackles more complex unions
I tested it against all closed issues regarding this matter (~20 issues) and it works perfectly.

But I don't want to add a library (especially when it's mine! 😆) randomly even though it would be an extra requirement and not a required one.
I could possibly change the chosen approach to avoid using this library if you're against it (I used it because it made my life easy)
I'll wait for your input on this matter 😄

@samuelcolvin
Copy link
Member

This is looking awesome, 🎉 I really want some of this functionality.

I'm a bit worried about adding another dependency, particularly because:

  • typingx duplicates a lot of the logic in fields.py
  • isinstancex is only used once (albeit in an important/commonly used place)
  • using smart_union doesn't "feel" like something that should require an extra dependency - it feels like a core part of pydantic
  • Although this is optional at the moment, it will become required in the next major release
  • This breaks the current logic that Unions (like List, Dict etc.) are implemented as a recursive group of fields which are queried in turn.

I know this is really annoying, but I would prefer to avoid the extra dependency if at all possible?

I would prefer to skip the second pass here or find another route. I'm welcome to input from others though...?

I think overall, the second pass will be less important as pydantic becomes stricter in v2.

@PrettyWood
Copy link
Member Author

@samuelcolvin to be honest adding the extra library was a quick win to test it. We can definitely just copy paste part of the logic directly in pydantic now or in v2 :)
I also don't want to add extra dependencies. I love libraries that don't come with a bunch of extra, especially when it's for a "core" functionality

@samuelcolvin
Copy link
Member

@PrettyWood makes sense.

@PrettyWood PrettyWood force-pushed the feat/strict-union branch 2 times, most recently from eeaca09 to 01515a5 Compare September 5, 2021 20:31
@PrettyWood
Copy link
Member Author

@djpugh I updated the doc and moved your section with default Union (not smart union) https://smokeshow.helpmanual.io/1q331m1k3x6v06512i6z/usage/types/#unions
Feedback welcome :)
@samuelcolvin I think it's ready for review

Comment on lines 249 to 258
!!! warning
`typing.Union` also ignores order when [defined](https://docs.python.org/3/library/typing.html#typing.Union),
so `Union[int, float] == Union[float, int]` which can lead to unexpected behaviour
when combined with matching based on the `Union` type order.
Please note that this can also be [affected by third party libraries](https://github.com/samuelcolvin/pydantic/issues/2835)
and their internal type definitions and the import orders.

As such, it is recommended that, when defining `Union` annotations, the most specific type is included first and
followed by less specific types. In the above example, the `UUID` class should precede the `int` and `str`
classes to preclude the unexpected representation as such:
followed by less specific types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PrettyWood - Suggest maybe flipping these two parts for now, as it comes across slightly confusing with the warning then the recommended approach (given the warning applies to the recommended approach), and then maybe refer to Smart Union as the recommended approach to resolve this?

Copy link
Member Author

@PrettyWood PrettyWood Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with b3de1a3
See https://smokeshow.helpmanual.io/3f1a2w4h2n461d35541v/usage/types/#unions. WDYT?
I want to show some exposure on smart union because we got a lot of issues regarding this matter.
Furthermore it will probably be the default behaviour for v2

docs/usage/types.md Outdated Show resolved Hide resolved
PrettyWood and others added 3 commits September 6, 2021 12:26
Co-authored-by: David J Pugh <6003255+djpugh@users.noreply.github.com>
Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

@PrettyWood I know this has been a lonnnnng time, so I can make the change here to get this merged, but I want to leave it for you to check you're happy with it even though it's a pretty trival change.

@@ -935,6 +935,35 @@ def _validate_singleton(
) -> 'ValidateReturn':
if self.sub_fields:
errors = []

if is_union_origin(get_origin(self.type_)) and self.model_config.smart_union:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if is_union_origin(get_origin(self.type_)) and self.model_config.smart_union:
if self.model_config.smart_union and is_union_origin(get_origin(self.type_)):

Should be quicker since it's just checking a bool, if that is false we skip the is_union_origin check.

@PrettyWood
Copy link
Member Author

@samuelcolvin updated! Please review 🙏

@samuelcolvin samuelcolvin merged commit c38c463 into pydantic:master Dec 7, 2021
@samuelcolvin
Copy link
Member

🎉 thank you so much.

@PrettyWood PrettyWood deleted the feat/strict-union branch December 7, 2021 22:10
@andriilahuta
Copy link
Contributor

@PrettyWood I've been trying this in master and it crashes when TypedDict is involved:

class Dict1(TypedDict):
    foo: str


class Dict2(TypedDict):
    bar: str


class M(BaseModel):
    d: Union[Dict2, Dict1]

    class Config:
        smart_union = True


M(d=dict(foo='baz'))  # raises TypeError: isinstance() arg 2 must be a type, a tuple of types, or a union

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment