Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First instantiation of generic Union model defines type coercion of models instantiated later #4474

Closed
7 of 16 tasks
sveinugu opened this issue Sep 3, 2022 · 17 comments · Fixed by #4482
Closed
7 of 16 tasks
Labels
bug V1 Bug related to Pydantic V1.X

Comments

@sveinugu
Copy link
Contributor

sveinugu commented Sep 3, 2022

Initial Checks

  • I have searched GitHub for a duplicate issue and I'm sure this is something new
  • I have searched Google & StackOverflow for a solution and couldn't find anything
  • I have read and followed the docs and still think this is a bug
  • I am confident that the issue is with pydantic (not my code, or another library in the ecosystem like FastAPI or mypy)

Description

When pydantic generic models are instantiated with Unions, the order of a particular set of types are fixed by the first instantiation, e.g.:

from typing import Generic, TypeVar, Union

from pydantic.generics import GenericModel

NumberT = TypeVar('NumberT')


class NumberModel(GenericModel, Generic[NumberT]):
    data: NumberT


FloatOrIntModel = NumberModel[Union[float, int]]
print(FloatOrIntModel(data='1').data)

IntOrFloatModel = NumberModel[Union[int, float]]
print(IntOrFloatModel(data='1').data)

prints

1.0
1.0

While changing the order of the instantiations changes type coercion of both models:

1
1

In pydantic, type coercion of Union models depends on the order of the types. However, as explained in the documentation on Unions:

typing.Union also ignores order when defined, so Union[int, float] == Union[float, int] which can lead to unexpected behaviour when combined with matching based on the Union type order inside other type definitions

I think this should be considered a bug in pydantic, even though the user is warned against 'unexpected behaviour' combined with other code that depend on Union type equality. The unexpected behaviour in this case is caused by pydantic itself, there is no "matching based on the Union type order inside other type definitions" in the example code, nor are any third-party code imported. It is not difficult to envision situations where this dependency could cause very hard-to-find bugs.

It seems the cause of the error is the _generic_types_cache in generics.py, which caches the first Union model and uses this as the blueprint of later matching models (i.e. with the same set of types).

Setting smart_union to True has no impact.

Related to #2835.

Example Code

No response

Python, Pydantic & OS Version

pydantic version: 1.10.1
            pydantic compiled: True
                 install path: /Users/sveinugu/Library/Caches/pypoetry/virtualenvs/unifair-myuXak-6-py3.10/lib/python3.10/site-packages/pydantic
               python version: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:42:03) [Clang 12.0.1 ]
                     platform: macOS-11.2.3-arm64-arm-64bit
     optional deps. installed: ['typing-extensions']

Affected Components

@sveinugu sveinugu added bug V1 Bug related to Pydantic V1.X unconfirmed Bug not yet confirmed as valid/applicable labels Sep 3, 2022
@PrettyWood
Copy link
Member

Yep it's a bug. Since for python Union[int, float] == Union[float, int], the key in the cache is the same.
I guess we could use (cls, params, get_args(params)) as key of _generic_types_cache instead of (cls, params)

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 4, 2022

Yep it's a bug. Since for python Union[int, float] == Union[float, int], the key in the cache is the same.

Good to hear!

I guess we could use (cls, params, get_args(params)) as key of _generic_types_cache instead of (cls, params)

What about nested models, say List[Union[float, int]] and List[Union[int, float]]? Wouldn’t these still be considered equal?

@samuelcolvin
Copy link
Member

Is this new in v1.10?

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 4, 2022

Is this new in v1.10?

No, saw it first in 1.9.0 and then updated to check if it was still there in the newest version. It is probably as old as the current specification of the key in _generic_types_cache.

@samuelcolvin
Copy link
Member

Humm, can't remember if that was new in 1.9 or 1.8.

The question is whether we should fix this in a patch release of 1.10 or wait for V2?

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 4, 2022

The question is whether we should fix this in a patch release of 1.10 or wait for V2?

That’s obviously not for me to say, but personally I think it would have been nice with a patch of at least the basic (non-nested) case, as the bug is breaking several of my tests.

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 5, 2022

This gets more complicated, as the more-or-less exact same issue is present in the typing library itself, e.g.

from typing import get_args, List, Union

print(get_args(Union[int, float]))
print(get_args(Union[float, int]))
print(get_args(List[Union[float, int]]))
print(get_args(List[Union[int, float]]))

Prints:

(<class 'int'>, <class 'float'>)
(<class 'float'>, <class 'int'>)
(typing.Union[float, int],)
(typing.Union[float, int],)

This is discussed in this CPython issue, which include comments by Guido, and which resulted in the following documentation change:

If X is a union or Literal contained in another generic type, the order of (Y, Z, ...) may be different from the order of the original arguments [Y, Z, ...] due to type caching.

(from the current Python docs)

This at least rules out a recursive solution using get_args for nested models (which I was toying around with). If I understand this correctly, I believe it also means that the difference between e.g. List[Union[float, int]] and List[Union[int, float]] would be inaccessible in Python itself, which is a problem.

Thinking from the perspective of a downstream library depending on pydantic I suppose one could implement a workaround for the nested problem(i.e. List[Union[int, float]] == List[Union[float, int]]) by replacing Union with a custom model, say OrderedUnion, which overrides __eq__. Such a model could even be included in pydantic if there is great enough need for this.

However, all of this does not disqualify the simple get_args fix suggested by @PrettyWood. It would still work for the top level Union case, as in my example code, which is really the only thing I need for my code anyway. Also, I suppose the fact that a solution to the nested problem would be dependent on the above-mentioned "feature" in Python is really an argument for only solving the simple non-nested issue now. So, in a sense, I would argue that this complication might actually make this issue simpler to manage, at least for now.

@samuelcolvin samuelcolvin removed the unconfirmed Bug not yet confirmed as valid/applicable label Sep 5, 2022
@samuelcolvin
Copy link
Member

Confirmed, I think we should fix this in V1.10.

PR welcome, it'll need to be pretty quick to make it into the next patch release #4472, but otherwise it can be included in the (inevitable) next patch release.

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 5, 2022

I can give it a try.

@samuelcolvin
Copy link
Member

great, thanks.

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 5, 2022

I'm getting the following mypy errors:

pydantic/generics.py:65: error: Argument 1 to "get_args" has incompatible type "Union[Type[Any], Tuple[Type[Any], ...]]"; expected "Type[Any]"  [arg-type]
pydantic/generics.py:131: error: Argument 1 to "get_args" has incompatible type "Tuple[Type[Any], ...]"; expected "Type[Any]"  [arg-type]

Which seems to be a bug in mypy: python/mypy#4625

Any thoughts on how to handle this?

@samuelcolvin
Copy link
Member

hard without seeing the change, create the PR so I can see it.

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 5, 2022

Basically, the type of the params parameter specified in GenericModel.__class_getitem__ is incompatible with the Type[Any] in the get_args method, according to mypy. As far as I can understand, Type[Any] is supposed to allow any type, including Union and Tuple. It makes sense to me that this is a mypy bug, but I might have misread the mypy issue, as the issue is not exactly the same.

Anyway, I'll add a test and submit the PR for you to see.

@samuelcolvin
Copy link
Member

I'd love to include this fix in v1.10.2, any chance you can submit the PR asap so I can review and merge it?

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 5, 2022

Sorry for the delay. The mypy issue caused the commit to fail, so had to do some acrobatics to be able to create the PR

@samuelcolvin
Copy link
Member

just add -n on commit.

@sveinugu
Copy link
Contributor Author

sveinugu commented Sep 5, 2022

just add -n on commit.

Next time I'll do that... 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug V1 Bug related to Pydantic V1.X
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants