compound type (Union) gets preferentially parsed as string #3321

Closed
3 tasks done
amenck opened this issue Oct 13, 2021 · 8 comments
Labels
bug V1 Bug related to Pydantic V1.X

Comments

@amenck

amenck commented Oct 13, 2021

Checks

  • I added a descriptive title to this issue
  • I have searched (google, github) for similar issues and couldn't find anything
  • I have read and followed the docs and still think this is a bug

Bug

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.7.3
            pydantic compiled: True
                 install path: /usr/local/lib/python3.7/site-packages/pydantic
               python version: 3.7.11 (default, Jul  6 2021, 12:42:50)  [Clang 12.0.0 (clang-1200.0.32.29)]
                     platform: Darwin-19.6.0-x86_64-i386-64bit
     optional deps. installed: ['typing-extensions', 'email-validator']

from pydantic.dataclasses import dataclass
from typing import Dict, Union

@dataclass
class MyDataclass:
    c: Dict[str, Union[str, bool, int]]

print(MyDataclass(c={"a": 1, "b": True, "c": "d"}))

output: MyDataclass(c={'a': '1', 'b': 'True', 'c': 'd'})
expected output: MyDataclass(c={'a': 1, 'b': True, 'c': 'd'})

As you can see from the repro above, it seems that the str component of the Union takes precedence. If I change the order to [bool, int, str], the output becomes: MyDataclass(c={'a': True, 'b': True, 'c': 'd'}). I'm not familiar at all with how the underlying code works, but my guess is that this gives preference to types according to their order in the Union. I find this behavior quite odd, especially considering that the objects in the input already conform to the types specified in the dataclass. Is this intended behavior?
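The order dependence described above can be imitated without pydantic. The following is a minimal sketch, not pydantic's actual implementation: `coerce_first_match`, `COERCERS`, and the three `to_*` helpers are hypothetical names, and their rules only approximate how pydantic 1.x validates `str`, `bool`, and `int` in non-strict mode.

```python
from typing import Any, Callable, Dict, Sequence


def to_str(value: Any) -> str:
    # Assumption: numbers (and therefore bools) are accepted and stringified.
    if isinstance(value, (str, int, float)):
        return str(value)
    raise TypeError("not a str")


def to_bool(value: Any) -> bool:
    # Assumption: real bools plus the integers 0 and 1 are accepted.
    if isinstance(value, bool):
        return value
    if isinstance(value, int) and value in (0, 1):
        return bool(value)
    raise TypeError("not a bool")


def to_int(value: Any) -> int:
    # int() accepts bools and numeric strings; text such as "d" raises ValueError.
    return int(value)


COERCERS: Dict[type, Callable[[Any], Any]] = {str: to_str, bool: to_bool, int: to_int}


def coerce_first_match(value: Any, members: Sequence[type]) -> Any:
    """Try each union member in declaration order and keep the first success."""
    for member in members:
        try:
            return COERCERS[member](value)
        except (TypeError, ValueError):
            continue
    raise ValueError(f"{value!r} matches no member of the union")


data = {"a": 1, "b": True, "c": "d"}
str_first = {k: coerce_first_match(v, (str, bool, int)) for k, v in data.items()}
bool_first = {k: coerce_first_match(v, (bool, int, str)) for k, v in data.items()}

print(str_first)   # {'a': '1', 'b': 'True', 'c': 'd'}
print(bool_first)  # {'a': True, 'b': True, 'c': 'd'}
```

Under these assumed rules, both orderings reproduce the outputs reported above: with `str` first every value can be stringified, and with `bool` first the integer `1` is accepted as `True`.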

@amenck amenck added the bug V1 Bug related to Pydantic V1.X label Oct 13, 2021
@tuchandra

I believe this is intended. From the Unions section of the docs, emphasis mine:

However, as can be seen above, pydantic will attempt to 'match' any of the types defined under Union and will use the first one that matches. In the above example the id of user_03 was defined as a uuid.UUID class (which is defined under the attribute's Union annotation) but as the uuid.UUID can be marshalled into an int it chose to match against the int type and disregarded the other types.

@amenck
Author

amenck commented Oct 14, 2021

Thanks for the quick reply and explanation! I would suggest changing this behavior so that arguments that are already instances of one of the allowed types in the union are left unchanged. The best argument I can think of for making this change is the following:

The docs you linked above continue on to say:

As such, it is recommended that, when defining Union annotations, the most specific type is included first and followed by less specific types.

However, it is not always clear what "the most specific type" is. In the example I gave above, we can see that using Union[bool, int, str] results in MyDataclass(c={'a': True, 'b': True, 'c': 'd'}). However, changing the order to Union[int, bool, str] will get me MyDataclass(c={'a': 1, 'b': 1, 'c': 'd'}). I'm not sure how I could arrange my type specification to produce the output I would want, in this case.
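The difficulty with `bool` and `int` comes from `bool` being a subclass of `int` in Python, so each value converts cleanly into the other type and neither ordering is "most specific first" for both. A minimal sketch of the suggestion above (leave a value alone when its exact type is already a union member) might look like this; `is_exact_member` is a hypothetical helper, not a pydantic API.

```python
from typing import Any, Sequence

# bool is a subclass of int, so each converts cleanly into the other.
assert issubclass(bool, int)
assert int(True) == 1 and bool(1) is True


def is_exact_member(value: Any, members: Sequence[type]) -> bool:
    """Return True when the value's exact type is one of the union members."""
    # type() rather than isinstance(), so that True does not count as an int.
    return type(value) in members


data = {"a": 1, "b": True, "c": "d"}
orderings = [(int, bool, str), (bool, int, str), (str, bool, int)]

# Every value already has an exact match under every ordering,
# so an exact-type check first would leave all of them unchanged.
untouched = all(
    is_exact_member(value, order) for order in orderings for value in data.values()
)
print(untouched)
print([type(value).__name__ for value in data.values()])
```

A check like this is independent of member order, which is the property the request is asking for; values with no exact match would still need some ordered fallback.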

@PrettyWood
Member

@amenck #2092 will help ;)

@amenck
Author

amenck commented Oct 14, 2021

Awesome, that will solve it. Thank you!

@mwgamble
Contributor

I believe this is intended. From the Unions section of the docs, emphasis mine:

The ordering of unions is supposed to be ignored:

https://github.com/python/cpython/blob/main/Lib/typing.py#L564

@tuchandra

I believe this is intended. From the Unions section of the docs, emphasis mine:

The ordering of unions is supposed to be ignored:

https://github.com/python/cpython/blob/main/Lib/typing.py#L564

The ordering is ignored when a type checker is reading an annotation to determine what type a variable is. If we have x: str | int, then yeah, that's equivalent to x: int | str.

But that's not what pydantic is doing. Pydantic is a parsing library, so when it sees that a value is supposed to be parsed into a Union, it has to decide which member to use. It can't parse something into a str and int simultaneously. It uses the order to make that decision in a predictable way.
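Both halves of this distinction can be checked with the standard library alone. The sketch below assumes Python 3.8 or later for `typing.get_args`; it shows only that the declared order stays available at runtime, and says nothing about how pydantic reads the annotation internally.

```python
from typing import Union, get_args

# As types, the two spellings compare equal: order is irrelevant to a type checker.
same_type = Union[str, int] == Union[int, str]

# At runtime the declared order is still recorded on each annotation,
# which is what lets a parsing library try members in a predictable sequence.
order_a = get_args(Union[str, int])
order_b = get_args(Union[int, str])

print(same_type)
print(order_a)
print(order_b)
```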

@tuchandra

tuchandra commented Feb 10, 2022

Regardless, though, looks like #2092 has landed!

@mwgamble
Contributor

But that's not what pydantic is doing.

Yes, I eventually worked that out after hours of pulling my hair out. Unfortunately, for someone who's very familiar with how mathematical unions are supposed to work, this behaviour was extremely surprising.

@amenck amenck closed this as completed Feb 3, 2023