pydantic>=2.5 classes can't be serialized #650

Open
isidentical opened this issue Mar 2, 2024 · 4 comments

@isidentical

import dill
from pydantic import BaseModel, Field


dill.settings["recurse"] = True


class Input(BaseModel):
    prompt: str = Field(
        ..., title="Prompt", description="The prompt to use for the completion."
    )
    num_inference_steps: int = Field(
        default=25,
        ge=20,
        le=100,
        title="Number of Inference Steps",
        description="The number of inference steps to take for each prompt.",
    )


Input2 = dill.loads(dill.dumps(Input))
print(Input2(prompt="test", num_inference_steps=25))

The same example works with cloudpickle:

import cloudpickle
from pydantic import BaseModel, Field


class Input(BaseModel):
    prompt: str = Field(
        ..., title="Prompt", description="The prompt to use for the completion."
    )
    num_inference_steps: int = Field(
        default=25,
        ge=20,
        le=100,
        title="Number of Inference Steps",
        description="The number of inference steps to take for each prompt.",
    )


Input2 = cloudpickle.loads(cloudpickle.dumps(Input))
print(Input2(prompt="test", num_inference_steps=25))
@gabrielmbmb

gabrielmbmb commented Apr 22, 2024

I also encountered this issue. I'm not sure whether the problem is on the dill side or the pydantic side, although pydantic.BaseModel instances can be serialized and deserialized with pickle and cloudpickle.

This is the minimum code required to reproduce the error with dill==0.3.8, pydantic==2.7.0 and pydantic_core==2.18.1:

import dill
from pydantic import BaseModel


class MyModel(BaseModel):
    pass


dill.loads(dill.dumps(MyModel()))

and the error:

/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py:414: PicklingWarning: Cannot locate reference to <class '__main__.MyModel'>.
  StockPickler.save(self, obj, save_persistent_id)
/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py:414: PicklingWarning: Cannot pickle <class '__main__.MyModel'>: __main__.MyModel has recursive self-references that trigger a RecursionError.
  StockPickler.save(self, obj, save_persistent_id)
Traceback (most recent call last):
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/test_step_decorator.py", line 13, in <module>
    dill.loads(dill.dumps(MyModel()))
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 303, in loads
    return load(file, ignore, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 289, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 444, in load
    obj = StockUnpickler.load(self)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/dill/_dill.py", line 593, in _create_type
    return typeobj(*args)
           ^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 93, in __new__
    private_attributes = inspect_namespace(
                         ^^^^^^^^^^^^^^^^^^
  File "/Users/gabrielmbmb/Source/Argilla/distilabel/.venv/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 406, in inspect_namespace
    raise PydanticUserError(
pydantic.errors.PydanticUserError: A non-annotated attribute was detected: `model_fields = {}`. All model fields require a type annotation; if `model_fields` is not meant to be a field, you may be able to resolve this error by annotating it as a `ClassVar` or updating `model_config['ignored_types']`.

For further information visit https://errors.pydantic.dev/2.7/u/model-field-missing-annotation

This only occurs if the pydantic.BaseModel subclass has been declared in __main__. If it's declared in another module, everything works.
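
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: for a class it cannot locate by reference, dill appears to rebuild the class by calling its metaclass with the saved class namespace (the `typeobj(*args)` frame), and that namespace already contains attributes the metaclass itself added earlier, such as `model_fields`. The toy below reproduces that pattern with the standard library only. `StrictMeta` and `rebuild_by_value` are hypothetical stand-ins, not pydantic or dill code.

```python
class StrictMeta(type):
    """Hypothetical metaclass that validates the class body, then adds bookkeeping."""

    def __new__(mcls, name, bases, namespace):
        # Crude stand-in for a "non-annotated attribute" check:
        # reject every public name found in the class body.
        for key in namespace:
            if not key.startswith("_"):
                raise TypeError(f"unexpected attribute in class body: {key!r}")
        cls = super().__new__(mcls, name, bases, namespace)
        # Bookkeeping attribute added after validation; it ends up in cls.__dict__.
        cls.model_fields = {}
        return cls


class MyModel(metaclass=StrictMeta):
    pass


def rebuild_by_value(cls):
    """Recreate a class from its current __dict__ by calling its metaclass again."""
    namespace = {
        key: value
        for key, value in cls.__dict__.items()
        if key not in ("__dict__", "__weakref__")
    }
    return type(cls)(cls.__name__, cls.__bases__, namespace)


try:
    rebuild_by_value(MyModel)
    outcome = "rebuilt"
except TypeError as exc:
    outcome = f"failed: {exc}"

print(outcome)
```

The first class creation passes validation because the body is empty. The rebuild fails because the saved namespace now carries the bookkeeping attribute. A class that lives in an importable module would normally be pickled by reference and never go through this rebuild, which would be consistent with the behavior described above.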

@mmckerns
Member

dill doesn't explicitly support pickling of pydantic classes, but I can help figure out if there's a patch to be applied in dill (due to something in the standard library), or in pydantic, or elsewhere.

If earlier versions of dill, pydantic, etc. could serialize a BaseModel instance, then one easy thing to do is to walk back over the commits and find the one that corresponds to the change in behavior. Also, dill provides a serialization trace that follows the recursive pickling process, so it's helpful to debug a serialization failure with dill.detect.trace(True). I can help decipher what the trace is telling you.
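
A minimal sketch of turning the trace on and off around a single dump, assuming dill is installed. The dictionary payload is only a placeholder; swap in the object that fails to serialize.

```python
import dill
import dill.detect

# Placeholder payload (an assumption for illustration, not from the report).
payload = {"square": lambda x: x * x}

dill.detect.trace(True)  # log each step of the recursive pickling process
try:
    data = dill.dumps(payload)
finally:
    dill.detect.trace(False)  # switch tracing back off

restored = dill.loads(data)
print(restored["square"](3))
```

The trace lines emitted during `dill.dumps` show which objects dill visits and in what order, so the last entries before a failure usually point at the object that could not be handled.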

@gabrielmbmb

Thanks for the info @mmckerns. I'll do some more tests and try to figure out what's happening.

@Rocamonde

Also facing this issue — we use pydantic for our run configs in ML experiments, and pickling is essential since we do distributed training. Any updates would be super helpful! Thank you
