Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change in behaviour with JSON serialization between v1.8.2 and v1.9.1 #4255

Closed
3 tasks done
liiight opened this issue Jul 19, 2022 · 14 comments
Closed
3 tasks done

Change in behaviour with JSON serialization between v1.8.2 and v1.9.1 #4255

liiight opened this issue Jul 19, 2022 · 14 comments
Labels
bug V1 Bug related to Pydantic V1.X

Comments

@liiight
Copy link
Contributor

liiight commented Jul 19, 2022

Checks

  • I added a descriptive title to this issue
  • I have searched (google, github) for similar issues and couldn't find anything
  • I have read and followed the docs and still think this is a bug

Bug

When converting to JSON via the .json() method, I used to override dict() in order to override behavior for different fields with the same type. Since 1.9.2 this does not seem to work anymore, as it appears that .json() does not call dict() internally. This looks to me like its related to #3542 but I'm not entirely sure if I'm doing something wrong or misunderstanding.

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

ocarmi@FVFGV4KVQ05P shield % python -c "import pydantic.utils; print(pydantic.utils.version_info())"
             pydantic version: 1.9.1
            pydantic compiled: True
                 install path: /Users/ocarmi/PycharmProjects/shield/venv/lib/python3.10/site-packages/pydantic
               python version: 3.10.4 (v3.10.4:9d38120e33, Mar 23 2022, 17:29:05) [Clang 13.0.0 (clang-1300.0.29.30)]
                     platform: macOS-12.4-arm64-arm-64bit
     optional deps. installed: ['email-validator', 'typing-extensions']

from pydantic import BaseModel
from datetime import datetime


class ExportTest(BaseModel):
    datetime_1: datetime = datetime.utcnow()
    datetime_2: datetime = datetime.utcnow()

    def dict(self, **kwargs):
        data = super().dict()
        data['datetime_1'] = data['datetime_1'].isoformat()
        data['datetime_2'] = data['datetime_2'].isoweekday()
        return data


e = ExportTest()
print(e.dict())
print(e.json())

Output for v1.9.1:

{'datetime_1': '2022-07-19T13:13:08.809973', 'datetime_2': 2}
{"datetime_1": "2022-07-19T13:13:08.809973", "datetime_2": "2022-07-19T13:13:08.810791"}

Output for v1.8.2:

{'datetime_1': '2022-07-19T13:58:09.047123', 'datetime_2': 2}
{"datetime_1": "2022-07-19T13:58:09.047123", "datetime_2": 2}

Also, Is there any other way to define a different serialization for the same type but on different fields? If so,I was not able to find it and I apologize.

@liiight liiight added the bug V1 Bug related to Pydantic V1.X label Jul 19, 2022
@liiight liiight changed the title Regression with JSON conversion when moving from 1.8.2 to 1.9.1 Change in behaviour with JSON serialization between v1.8.2 and v1.9.1 Jul 19, 2022
@hramezani
Copy link
Member

I think the behavior changed in 458f257

you can use Custom JSON serialisation

from pydantic import BaseModel
from datetime import datetime


def custom_dumps(v, *, default):
    v['datetime_1'] = v['datetime_1'].isoformat()
    v['datetime_2'] = v['datetime_2'].isoweekday()
    return v

class ExportTest(BaseModel):
    datetime_1: datetime = datetime.utcnow()
    datetime_2: datetime = datetime.utcnow()

    def dict(self, **kwargs):
        data = super().dict()
        data['datetime_1'] = data['datetime_1'].isoformat()
        data['datetime_2'] = data['datetime_2'].isoweekday()
        return data

    class Config:
        json_dumps = custom_dumps

e = ExportTest()
print(e.dict())
print(e.json())

Output:

{'datetime_1': '2022-07-26T21:28:38.785185', 'datetime_2': 2}
{'datetime_1': '2022-07-26T21:28:38.785185', 'datetime_2': 2}

@liiight
Copy link
Contributor Author

liiight commented Jul 27, 2022

@hramezani thank you for pointing that out, i was not aware of that usage of json_dumps
However, I could not find this change in the changelog, and this is a breaking change for me. I wonder how this change is viewed upon by the pydantic team

@samuelcolvin
Copy link
Member

I'm sorry about the breaking change, it was not intentional, we should have thought about it more and been more careful. 🙏

I don't think we can revert, the real solution is to be much stricter about breaking changes after V2 and consequently, make more regular major releases.

@liiight
Copy link
Contributor Author

liiight commented Jul 27, 2022

@samuelcolvin no worries, I greatly appreciate the work you guys are doing on this amazing project!
I think it's worthwhile to mention this change in the changelog though, not exactly sure how to best represent this though

@samuelcolvin
Copy link
Member

samuelcolvin commented Jul 27, 2022

PR welcome to update the changelog.

I'm hoping to work full time on pydantic from next week once I get the first release of pydantic-core out and give arq some much needed love.

@liiight
Copy link
Contributor Author

liiight commented Jul 27, 2022

I'll create the PR and close this issue accordingly.
Thanks for the swift reply!

@liiight
Copy link
Contributor Author

liiight commented Jul 31, 2022

After further examination, the provided solution does not seem to be as straightforward as it seems. Running the example provided by @hramezani returns this in v1.9.1:

Traceback (most recent call last):
  File "/Users/ocarmi/Library/Application Support/JetBrains/PyCharm2022.1/scratches/scratch_5.py", line 27, in <module>
    print(e.json())
  File "pydantic/main.py", line 500, in pydantic.main.BaseModel.json
TypeError: Expected unicode, got dict

Adding json.dumps to the return solves this:

from pydantic import BaseModel
from datetime import datetime
import json


def custom_dumps(v, *, default):
    v['datetime_1'] = v['datetime_1'].isoformat()
    v['datetime_2'] = v['datetime_2'].isoweekday()
    return json.dumps(v)


class ExportTest(BaseModel):
    datetime_1: datetime = datetime.utcnow()
    datetime_2: datetime = datetime.utcnow()

    def dict(self, **kwargs):
        data = super().dict()
        data['datetime_1'] = data['datetime_1'].isoformat()
        data['datetime_2'] = data['datetime_2'].isoweekday()
        return data

    class Config:
        json_dumps = custom_dumps


e = ExportTest()
print(e.dict())
print(e.json())

BUT, there is a caveat here. While this example is super simplified, in the real world I have a lot of nested models in my code. Consider this:

from pydantic import BaseModel
from datetime import datetime
import json


def custom_dumps(v, *, default):
    v['datetime_1'] = v['datetime_1'].isoformat()
    v['datetime_2'] = v['datetime_2'].isoweekday()
    return json.dumps(v)


class Internal(BaseModel):
    time: datetime = datetime.utcnow()

    class Config:
        json_encoders = {datetime: str(datetime)}


class ExportTest(BaseModel):
    internal: Internal = Internal()
    datetime_1: datetime = datetime.utcnow()
    datetime_2: datetime = datetime.utcnow()

    def dict(self, **kwargs):
        data = super().dict()
        data['datetime_1'] = data['datetime_1'].isoformat()
        data['datetime_2'] = data['datetime_2'].isoweekday()
        return data

    class Config:
        json_dumps = custom_dumps


e = ExportTest()
print(e.dict())
print(e.json())

This does NOT work as my custom json_dumps method now need to handle all internal serializations:


Traceback (most recent call last):
  File "/Users/ocarmi/Library/Application Support/JetBrains/PyCharm2022.1/scratches/scratch_5.py", line 36, in <module>
    print(e.json())
  File "pydantic/main.py", line 500, in pydantic.main.BaseModel.json
  File "/Users/ocarmi/Library/Application Support/JetBrains/PyCharm2022.1/scratches/scratch_5.py", line 9, in custom_dumps
    return json.dumps(v)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type datetime is not JSON serializable

Even if I add another custom dumps method to the internal class, apparantly it's not called when using the external serializaiton:

from pydantic import BaseModel
from datetime import datetime
import json


def internal_dumps(v, *, default):
    v['time'] = str(v['time'])
    return v


def custom_dumps(v, *, default):
    v['datetime_1'] = v['datetime_1'].isoformat()
    v['datetime_2'] = v['datetime_2'].isoweekday()
    return json.dumps(v)


class Internal(BaseModel):
    time: datetime = datetime.utcnow()

    class Config:
        json_encoders = {datetime: str(datetime)}
        json_dumps = internal_dumps


class ExportTest(BaseModel):
    internal: Internal = Internal()
    datetime_1: datetime = datetime.utcnow()
    datetime_2: datetime = datetime.utcnow()

    def dict(self, **kwargs):
        data = super().dict()
        data['datetime_1'] = data['datetime_1'].isoformat()
        data['datetime_2'] = data['datetime_2'].isoweekday()
        return data

    class Config:
        json_dumps = custom_dumps


e = ExportTest()
print(e.dict())
print(e.json())

This fails exactly the same.

For further comparison, this is the code and output when using pydantic 1.8.2:

from pydantic import BaseModel
from datetime import datetime


class Internal(BaseModel):
    time: datetime = datetime.utcnow()


class ExportTest(BaseModel):
    internal: Internal = Internal()
    datetime_1: datetime = datetime.utcnow()
    datetime_2: datetime = datetime.utcnow()

    def dict(self, **kwargs):
        data = super().dict()
        data['datetime_1'] = data['datetime_1'].isoformat()
        data['datetime_2'] = data['datetime_2'].isoweekday()
        return data


e = ExportTest()
print(e.dict())
print(e.json())
{'internal': {'time': datetime.datetime(2022, 7, 31, 9, 50, 49, 62540)}, 'datetime_1': '2022-07-31T09:50:49.063791', 'datetime_2': 7}
{"internal": {"time": "2022-07-31T09:50:49.062540"}, "datetime_1": "2022-07-31T09:50:49.063791", "datetime_2": 7}

So it seems there isn't an easy workaround for this issue (other than let the external dumps method be in-charge of also handling of all the nested models serializations, which is not ideal to say the least).
With this new information, is it possible to consider this issue again? Would it be possible to return to the old way using a new flag perhaps? (I ask knowing very little on the internals revolving this change)

As it stands this is a blocker for me to upgrade my python version to 3.10+

@samuelcolvin
Copy link
Member

@PrettyWood do you have any suggestions?

I'm really sorry about your pain but I'd rather not add a flag for one release, unless we really have to - the behavior will change completely in V2 (we also need to make sure customisation like this is possible in V2).

@liiight
Copy link
Contributor Author

liiight commented Jul 31, 2022

I understand @samuelcolvin and waiting for V2 is a perfectly reasonable solution for me, thanks!

If I may make a suggestion though, this entire customization was done in order to allow different serialization of the same type for different fields. I'd like to propose an API suggestion for future releases (if relevant):

from pydantic import BaseModel, Field
from datetime import datetime


class ExportTest(BaseModel):
    datetime_1: datetime = Field(default_factory=datetime.utcnow, json_encoder=lambda x: x.isoformat())
    datetime_2: datetime = datetime.utcnow()

    class Config:
        json_encoders = {datetime: lambda x: x.isoweekday()}

Setting the encoder using Field will be allow for this per field customization and will override the encoder set via Config if applicable.

@samuelcolvin
Copy link
Member

I think that makes complete sense.

I haven't started working (or thinking) on how we do exporting/dumping in v2/pydantic-core but imagine it'll be somewhat similar to validation where we have a linked collection of serialisers, a functional serialiser would accomplish exactly what you need.

@liiight
Copy link
Contributor Author

liiight commented Jul 31, 2022

That sound awesome, looking forward to V2.

Would you like to keep this issue open as a reference?

@samuelcolvin samuelcolvin added this to the Version 2 milestone Jul 31, 2022
@samuelcolvin
Copy link
Member

Yes, I think best to keep it open.

@liiight
Copy link
Contributor Author

liiight commented Aug 2, 2022

Ok, so I was able to come up with a pretty silly workaround:

from pydantic import BaseModel, Field
from datetime import datetime
import json


class Internal(BaseModel):
    d: datetime = Field(default_factory=datetime.utcnow)


class External(BaseModel):
    internal: Internal = Internal()
    datetime_1: datetime = datetime.utcnow()
    datetime_2: datetime = datetime.utcnow()

    class Config:
        json_encoders = {datetime: lambda v: v.isoformat()}

    def json(self, **kwargs):
        json_text = super().json(**kwargs)
        data = json.loads(json_text)
        data['datetime_2'] = datetime.fromisoformat(data['datetime_2']).isoweekday()
        return json.dumps(data)


e = External()
print(e.json())
{"internal": {"d": "2022-08-02T10:24:24.132888"}, "datetime_1": "2022-08-02T10:24:24.133470", "datetime_2": 2}

I'll just override the json() method, load the data, load the field i need, convert it to the desired format and dump the data back 😅

Not great but calling json() will correctly serialize nested models and will allow me to just handle the specific model values I want.

@Kludex
Copy link
Member

Kludex commented Apr 27, 2023

Given the work around (👀), and the fact that we are focusing on V2, I'll be closing this issue. 🙏

@Kludex Kludex closed this as not planned Won't fix, can't repro, duplicate, stale Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug V1 Bug related to Pydantic V1.X
Projects
None yet
Development

No branches or pull requests

4 participants