Allow different serializer to an `Annotated` type for "python" and "json" mode #8086

ChillPC · 2023-11-11T08:22:26Z

Initial Checks

I have searched Google & GitHub for similar requests and couldn't find anything
I have read and followed the docs and still think this feature is missing

Description

A demo of how code might look when using the feature

Either allow as many serialization modes as you want:

WeirdInt = Annotated[
    int,
    PlainSerializer(lambda i: f"{i} in mode python", return_type=str, mode={"python"}),
    PlainSerializer(lambda i: f"{i} in mode json", return_type=str, mode={"json"}),
    PlainSerializer(lambda i: f"{i} in mode my-custom-mode", return_type=str, mode={"my-custom-mode"}),

    PlainSerializer(lambda i: f"{i} in mode mode1 or mode2", return_type=str, mode={"mode1", "mode2"}),
]

class WeirdIntModel(BaseModel):
    i: WeirdInt

assert WeirdIntModel(i=1).i == 1
assert WeirdIntModel(i=1).model_dump() == {"i": "1 in mode python"}
assert WeirdIntModel(i=1).model_dump(mode="json") == {"i": "1 in mode json"}  # Used by `.model_dump_json()`
assert WeirdIntModel(i=1).model_dump(mode="my-custom-mode") == {"i": "1 in mode my-custom-mode"}

assert WeirdIntModel(i=1).model_dump(mode="mode1") == {"i": "1 in mode mode1 or mode2"}
assert WeirdIntModel(i=1).model_dump(mode="mode2") == {"i": "1 in mode mode1 or mode2"}

Or at least differentiate between the "python" and "json" modes:

WeirdInt = Annotated[
    int,
    PlainSerializer(lambda i: f"{i} in mode python", return_type=str),
    PlainSerializer(lambda i: f"{i} in mode json", return_type=str, when_used="json"), # Do not erase previous serializer
]

class WeirdIntModel(BaseModel):
    i: WeirdInt

assert WeirdIntModel(i=1).i == 1
assert WeirdIntModel(i=1).model_dump() == {"i": "1 in mode python"}
assert WeirdIntModel(i=1).model_dump(mode="json") == {"i": "1 in mode json"}

Your use case(s) for the feature

I work with MongoDB and with dates. Each date needs to take three different forms:

datetime.date in the business logic
datetime.datetime when dumping into mongo because there is no bson type representing only the date part of a datetime
str in format YYYYMMDD for the api

My first idea was to put multiple serializers on an Annotated type, like this:

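from datetime import date, datetime, time
from typing import Annotated, Any

from pydantic import BaseModel, BeforeValidator, PlainSerializer, WithJsonSchema

def validate_date(v: Any) -> date: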
    if isinstance(v, datetime):
        return v.date()
    if isinstance(v, date):
        return v

    match v:
        case str(s):
            return date.fromisoformat(s)
        case int(x) | float(x):
            return date.fromtimestamp(x)
    raise ValueError(
        f"'{v}' should be a valid date, a string in ISO 8601 format "
        "or an integer/float of an epoch timestamp in seconds."
    )

GtfsDate = Annotated[
    date,
    BeforeValidator(validate_date),
    PlainSerializer(lambda d: datetime.combine(d, time()), return_type=datetime),
    PlainSerializer(lambda d: d.strftime("%Y%m%d"), return_type=str, when_used="json"),
    WithJsonSchema({"type": "string"}, mode="serialization"),
]

I thought the second PlainSerializer would override the first one only in "json" mode, but serializing a BaseModel with this field gives:

class T(BaseModel):
    d: GtfsDate

T(d=date.today()).model_dump()             # => {'d': datetime.date(2023, 11, 10)} instead of {'d': datetime.datetime(2023, 11, 10, 0, 0)}
T(d=date.today()).model_dump(mode="json")  # => {'d': '20231110'}
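Until something like the proposed feature exists, one possible workaround (a sketch, not something suggested in this thread) is a single PlainSerializer whose function accepts the serialization info and branches on `info.mode`. The name `serialize_date` is made up for illustration, and the date is fixed so the result is reproducible:

```python
from datetime import date, datetime, time
from typing import Annotated, Any

from pydantic import BaseModel, PlainSerializer, SerializationInfo


def serialize_date(d: date, info: SerializationInfo) -> Any:
    # Hypothetical helper: pick the output form from the dump mode.
    if info.mode == "json":
        return d.strftime("%Y%m%d")      # API form: YYYYMMDD string
    return datetime.combine(d, time())   # python mode: datetime at midnight


GtfsDate = Annotated[date, PlainSerializer(serialize_date)]


class T(BaseModel):
    d: GtfsDate


t = T(d=date(2023, 11, 10))
python_dump = t.model_dump()
json_dump = t.model_dump(mode="json")
```

This keeps both behaviours in one function, so it does not compose the way two stacked serializers would, but it works with the existing `when_used` default.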

Why the feature should be added to pydantic (as opposed to another library or just implemented in your code)

This touches serialization at the field level, not the model level. Implementing it in custom user code would certainly be too cumbersome.

Affected Components

sydney-runkle · 2023-11-13T15:57:05Z

Hi @ChillPC,

This seems like a great idea. You've brought up some great points about varied use cases for this kind of feature.

Do you have any interest in opening a PR adding support for this kind of logic? Perhaps this is something we could fit into our next minor release 😄.

ChillPC · 2023-11-14T16:52:45Z

Hello @sydney-runkle !

I would certainly be interested, but:

the codebase is quite large and I will certainly need help to grok it
I don't think this feature will be easy to implement
it will certainly touch the Rust pydantic-core code

Difficulties

BaseModel.model_dump(mode: Literal['json', 'python'] | str = 'python') is flexible enough so it would not be a problem.

The problem arises from the signature of PlainSerializer, whose when_used parameter is not flexible enough. It goes all the way down to pydantic-core in src/serializers/type_serializers/format.rs.
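For reference, here is a small sketch of what the current `when_used` values can express: a serializer that applies only in "json" mode, with no way to attach a second one for "python" mode. The names `JsonOnlyInt` and `JsonOnlyModel` are illustrative only:

```python
from typing import Annotated

from pydantic import BaseModel, PlainSerializer

# Applied only when dumping in "json" mode; python mode keeps the raw int.
JsonOnlyInt = Annotated[
    int,
    PlainSerializer(lambda i: f"{i} in mode json", return_type=str, when_used="json"),
]


class JsonOnlyModel(BaseModel):
    i: JsonOnlyInt


model = JsonOnlyModel(i=1)
python_dump = model.model_dump()
json_dump = model.model_dump(mode="json")
```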

In the definition of PlainSerializer there is:

schema['serialization'] = core_schema.plain_serializer_function_ser_schema(
    ...,
    when_used=self.when_used,
    ...
)

Would it be acceptable to store the newly created plain_serializer... into schema['serialization'][<name_of_mode>]? Would it be in the Rust part?

Api consideration

For PlainSerializer (and WrapSerializer), the signature of when_used could be changed to something like this, keeping it backward compatible and translating the literals to a set object:

when_used: Literal['always', 'unless-none', 'json', 'json-unless-none'] | set[str] = {'python', 'json'}

But set[str] cannot express None handling the way unless-none does. Should the distinction be added with a prefix or suffix? Or should the key of the dict be a frozen dataclass with when: str and unless_none: bool? Would it be "easy" to implement such a map in Rust?

And what about the key for the serializer to fall back to? Would it be a magic string like "always" or "default"?
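To make the questions above concrete, here is a pure-Python sketch of such a map. Everything in it is hypothetical (`SerializerKey`, `FALLBACK_MODE`, `pick_serializer`); it only illustrates a frozen dataclass used as a dict key with an "always" fallback, not how pydantic or pydantic-core would implement it:

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)  # frozen => hashable, so instances can be dict keys
class SerializerKey:
    when: str
    unless_none: bool = False


FALLBACK_MODE = "always"  # assumed magic string for the fallback entry

Registry = Dict[SerializerKey, Callable[[Any], Any]]


def pick_serializer(registry: Registry, mode: str, value: Any) -> Optional[Callable[[Any], Any]]:
    """Try the requested mode first, then the fallback entry."""
    for when in (mode, FALLBACK_MODE):
        for key, serializer in registry.items():
            if key.when != when:
                continue
            if key.unless_none and value is None:
                return None  # leave None to the default serialization
            return serializer
    return None


registry: Registry = {
    SerializerKey("json", unless_none=True): lambda d: d.strftime("%Y%m%d"),
    SerializerKey(FALLBACK_MODE): lambda d: datetime.combine(d, time()),
}

d = date(2023, 11, 10)
json_value = pick_serializer(registry, "json", d)(d)
python_value = pick_serializer(registry, "python", d)(d)
none_serializer = pick_serializer(registry, "json", None)
```

One open design choice in this sketch is whether a skipped `unless_none` entry should fall through to the fallback serializer; here it does not.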

Sorry if this is a lot of questions 😅 I just want to be sure I am going in the right direction.

sshishov · 2023-11-29T17:19:15Z

I also ran into this issue today.

We allow serializing the model in both python and json modes. Why don't we allow the same keys for PlainSerializer? Why do we allow always and json? Where is python?
IMHO it is a big oversight by the core team to assume that the DATA stored in the model should ALWAYS be serialized into python AS IS, which is not true in a lot of cases.

I would propose two separate parameters instead of a lot of Literals:

mode: python, json, always
unless_none: True, False

Am I missing something?
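As a rough illustration of that split, the four existing when_used literals can be mapped onto a (mode, unless_none) pair. The names `WhenUsed`, `LEGACY` and `applies` are invented for this sketch and are not part of pydantic:

```python
from typing import NamedTuple


class WhenUsed(NamedTuple):
    mode: str          # "python", "json" or "always"
    unless_none: bool  # skip the custom serializer when the value is None


# How the current literals would translate into the two-parameter form.
LEGACY = {
    "always": WhenUsed("always", False),
    "unless-none": WhenUsed("always", True),
    "json": WhenUsed("json", False),
    "json-unless-none": WhenUsed("json", True),
}


def applies(when: WhenUsed, mode: str, value: object) -> bool:
    """Decide whether a custom serializer should run for this dump."""
    if when.unless_none and value is None:
        return False
    return when.mode in ("always", mode)


python_only = WhenUsed("python", False)  # the combination missing today
```

With this shape, a python-only serializer becomes expressible without adding more literals.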

sydney-runkle · 2023-12-04T15:51:50Z

@ChillPC,

Apologies for the delay. We're excited that you're going to help implement this!

I think that @davidhewitt will be the best person to answer these questions. Specifically, DH, what do you think about these two inquiries?

> Would it be acceptable to store the newly created plain_serializer... into schema['serialization'][<name_of_mode>]? Would it be in the Rust part?

In other words, should we expand the definition of when_used, or expand the locations in which we store serialization schema as suggested above?

> But set[str] cannot express None handling the way unless-none does. Should the distinction be added with a prefix or suffix? Or should the key of the dict be a frozen dataclass with when: str and unless_none: bool? Would it be "easy" to implement such a map in Rust?

Good question. I think having two indicators here could be useful.

> And what about the key for the serializer to fall back to? Would it be a magic string like "always" or "default"?

The PlainSerializer and WrapSerializer types default to 'always', so I think the answer to your question is yes.

Feel free to reach out if you have more questions. I'll be much more prompt with responses moving forward.

davidhewitt · 2023-12-05T07:39:03Z

I think there are two different feature requests here, so let's separate them. Custom modes like mode={"mode1", "mode2"} might be a lot of work, so let's keep that out of this issue and discuss it elsewhere if it's really needed.

For "python" and "json" mode, this is already supported in pydantic_core by the json_or_python schema. If needed it could be built with the same validator for each mode but different serialization schemas for each mode, e.g.:

schema_python = some_validation_schema()
schema_json = schema_python.copy()
schema_python['serialization'] = python_serializer
schema_json['serialization'] = json_serializer

core_schema.json_or_python_schema(json_schema=schema_json, python_schema=schema_python)

So I think this can be implemented without any Rust changes; we just need to work out a desirable way to expose this in the Pydantic API and build a schema like the one above.
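A filled-in version of that snippet might look like the following sketch. It assumes a plain date schema for validation and two lambda serializers borrowed from the use case earlier in the thread. It only builds the core schema and inspects its structure; it makes no claim about how the resulting serializer behaves in each mode, which is what the follow-up comment questions:

```python
from datetime import date, datetime, time

from pydantic_core import core_schema

# One serializer per mode (lambdas taken from the GtfsDate use case).
python_serializer = core_schema.plain_serializer_function_ser_schema(
    lambda d: datetime.combine(d, time())
)
json_serializer = core_schema.plain_serializer_function_ser_schema(
    lambda d: d.strftime("%Y%m%d")
)

# Same validation schema for both modes, copied so each gets its own serializer.
schema_python = core_schema.date_schema()
schema_json = schema_python.copy()
schema_python["serialization"] = python_serializer
schema_json["serialization"] = json_serializer

combined = core_schema.json_or_python_schema(
    json_schema=schema_json, python_schema=schema_python
)
```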

ChillPC · 2023-12-22T19:09:35Z

Hello @davidhewitt , json_or_python_schema seems to act weirdly. See my PR for details.

ChillPC added the feature request label Nov 11, 2023

sydney-runkle self-assigned this Nov 13, 2023

sydney-runkle added the help wanted Pull Request welcome label Nov 15, 2023

ChillPC linked a pull request Dec 22, 2023 that will close this issue

Allow different serializer to an Annotated type for python and json mode #8432

Open

5 tasks

rumbarum mentioned this issue Feb 6, 2024

CustomType not working as expected (with arrow) #8737

Closed

1 task

sydney-runkle mentioned this issue Mar 17, 2024

Provide a mode parameter in validate_python and model_validate #9009

Open

13 tasks

sydney-runkle added this to the v2.8.0 milestone Mar 17, 2024

sydney-runkle mentioned this issue Mar 26, 2024

(WIP) high priority / high impact work 🚀 #9102

Open

21 tasks
