Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V2: Descriminated Unions #4675

Closed
samuelcolvin opened this issue Oct 27, 2022 · 22 comments · Fixed by #5051
Closed

V2: Descriminated Unions #4675

samuelcolvin opened this issue Oct 27, 2022 · 22 comments · Fixed by #5051
Assignees
Labels
Change Suggested alteration to pydantic, not a new feature nor a bug feature request Feedback Wanted help wanted Pull Request welcome

Comments

@samuelcolvin
Copy link
Member

Descriminated unions are getting lots of super powers in V2, probably best example right now is the tests in pydantic-core, see https://github.com/pydantic/pydantic-core/blob/main/tests/validators/test_tagged_union.py.

In particular, descriminated union desciminators can now be a path or a function.

How should we define these? I guess it can all be done with Field and Annotated?

@samuelcolvin samuelcolvin added feature request help wanted Pull Request welcome Change Suggested alteration to pydantic, not a new feature nor a bug Feedback Wanted labels Oct 27, 2022
@smhavens
Copy link

Would it be alright for myself and @marcoo47 to request taking this one over as first time contributors? If so, what would you recommend for first steps, we've been reading over the repository code and documentation in the manual.

@samuelcolvin
Copy link
Member Author

Absolutely, have a go. You might also want to read through pydantic-core and understand how new descriminated unions work. I'll give some code examples of the final interface soon.

@smhavens
Copy link

Thank you so much!

@marcoo47
Copy link

Hi, my partner and I have read through most of the relevant documentation. Did you ever make the code examples for the final interface you mentioned?

@samuelcolvin
Copy link
Member Author

Interface should be as follows:

from typing import Literal, Union, Annotated

from pydantic import BaseModel, Field
from pydantic.descriminators import Tag


class Cat(BaseModel):
    meows: int


class Dog(BaseModel):
    barks: float


class Lizard(BaseModel):
    scales: bool


class Model(BaseModel):
    pet: Union[
        Annotated[Cat, Tag('cat')],
        Annotated[Dog, Tag('dog')],
        Annotated[Lizard, Tag('lizard')],
    ] = Field(..., discriminator='pet_type')
    n: int

A few notes:

  • Tag is just a dataclass used as the key in tagged-union `choices
  • Field(discriminator...) can also now be a function as that's supported by pydantic-core
  • There's a question over whether we can infer Tag in the case that all models have a literal with the same name, as we do now
  • once Serialization pydantic-core#327 get's merged, pydanticc-core will support multiple keys to the same choice, but for now we need to either raise an error for that case and wait for that feature, or duplicate the union member with the different key - e.g. in the pet_type: Literal['reptile', 'lizard'] case shown in the docs.

@smhavens
Copy link

I was wondering how you would like paths and functions to work for discriminated unions. Would they be like the variables being used to determine class like above (a function within the list of classes/objects to be decided from) or an outside function that is run to determine the object instead?

@samuelcolvin
Copy link
Member Author

I don't understand, it's either a string, a list, or a function. Shouldn't matter to your logic.

@smhavens
Copy link

Alright, I understand after double checking the schema.

@smhavens
Copy link

smhavens commented Dec 1, 2022

Are there any example cases of the discriminator being a path or function?

@samuelcolvin
Copy link
Member Author

Yes lots in the tests.

@marcoo47
Copy link

marcoo47 commented Dec 2, 2022

We've looked at the tests, use cases, examples, and other guides for discriminated unions. I think we misinterpreted the initial issue as needing to implement discriminated-unions with path and function cases but it looks like that's already done in pydantic core based on the attached test case. I guess we're just unsure about what your expectations were from our contribution.

@samuelcolvin
Copy link
Member Author

Yes, it's all implemented in pydantic-core. This issue, and most issues in pydantic related to V2 about about supporting the functionality in pydantic, using type hints, then updating and extending the unit tests to work.

I think everything in tests/test_discrimated_union.py is currently marked as xfail, these tests should all pass or be updated to pass.

@philhchen
Copy link

Hi @samuelcolvin I'd like to take this over. I think I have a decent idea of how to start and can open a PR in the next few days.

@samuelcolvin samuelcolvin assigned philhchen and unassigned smhavens Dec 29, 2022
@samuelcolvin
Copy link
Member Author

Great, go for it.

@philhchen
Copy link

philhchen commented Dec 30, 2022

From the blog and #4883 (comment), it seems like __root__ is being deprecated. What would be the alternative way of defining models for tests like

def test_discriminated_union_root_same_discriminator():
    class BlackCat(BaseModel):
        pet_type: Literal['blackcat']

    class WhiteCat(BaseModel):
        pet_type: Literal['whitecat']

    class Cat(BaseModel):
        __root__: Union[BlackCat, WhiteCat]

    class Dog(BaseModel):
        pet_type: Literal['dog']

    with pytest.raises(PydanticUserError, match="Field 'pet_type' is not the same for all submodels of 'Cat'"):

        class Pet(BaseModel):
            __root__: Union[Cat, Dog] = Field(..., discriminator='pet_type')

@philhchen
Copy link

philhchen commented Jan 4, 2023

hey @samuelcolvin ping on the above question! I have most of it figured out, but I'm a bit blocked on how you were envisioning replacing __root__ models.

@samuelcolvin
Copy link
Member Author

You're right, __root__ is going.

In that specific case, you could just replace class Cat... with

Cat = Union[BlackCat, WhiteCat]

and everything should work as expected.


For the general case of replacing __root__, the short answer is #4669.

So a simplified variant of your example would become

from pydantic import BaseModel, validate

class Cat(BaseModel):
    pet_type: Literal['cat']

class Dog(BaseModel):
    pet_type: Literal['dig']

Pet = Cat | Dog
ValidatedPet = validate(Pet)

The question then is: what is ValidatedPet? I'll try to answer that on #4669 now.

@samuelcolvin
Copy link
Member Author

to be clear, this doesn't really have anything to do with discriminated unions, except that I guess validate will take a bunch of kwargs, including discriminator like Field.

@philhchen
Copy link

Slight delay here, but I've made some progress with roughly half the tests passing. Some questions:

  1. In this unit test, we need to set from_attributes=True in order to properly validate the inner object Top(sub=A(foo='a')), or else the validator attempts to parse A(foo='a') as a typed dict. I assume we don't want to globally set from_attributes=True in generate_config, but adding from_attributes to TaggedUnionSchema would require changing the rust core as well, since it currently would throw an ErrorType::ExtraForbidden. How do you suggest I allow pass in from_attributes = True into the validator of the tagged union?
  2. When the discriminator types are literal ints (or something other than strings), like in this unit test, how should we populate 'choices' in the schema? If we convert ints and other objects to strings, then there could be the tricky case where '1' and 1 could technically be different tags for a discriminated union.

@philhchen
Copy link

Here's a draft PR with my progress so far. Depending on the answers to my questions above, I think I might need to make some edits to pydantic-core, which I would be happy to do if necessary.

@samuelcolvin
Copy link
Member Author

Thanks so much @philhchen for digging into this. These seem like really insightful questions.

  1. from_attributes

Good point, I guess we'll need to change the behaviour of the tagged union validator in pydantic-core to assume from_attributes=True when looking up discriminators, even if from_attributes=True is not set otherwise

  1. string choices

Again good catch, you'll need to change:

  • the type to Required[Dict[Union[str, int], Union[Union[str, int], CoreSchema]]]
  • This to AHashMap<ChoiceKey, CombinedValidator> where ChoiceKey is an enum of String, isize with Hash derived

There might be some more nuance, e.g. for other discriminator types, but let's start with the obvious choices.


If you can create a PR (or two PRs) for this on pydantic-core that would be amazing, otherwise I can look into it.

@philhchen
Copy link

Thanks for the help! Is there a reason to use isize instead of i64? The methods strict_int and lax_int return i64, so I think it may be more straightforward to keep the ChoiceKey the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Change Suggested alteration to pydantic, not a new feature nor a bug feature request Feedback Wanted help wanted Pull Request welcome
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants