PicklingError when using some type-annotated function #1528

Closed
elazarcoh opened this issue Nov 28, 2023 · 3 comments

elazarcoh commented Nov 28, 2023

I encountered a bug(?) when using type-annotated functions with a pydantic class.
Generally, passing the pydantic object works flawlessly. The issue is with a function that uses it as a type hint.
I don't know the internals of joblib or pydantic, so I don't know what the reason might be, or how to reproduce it without pydantic (but I think it is relevant to other cases as well, not just pydantic classes).
My workaround is to wrap the type hint in quotes, but I think this is something that can be fixed in the library.
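
For illustration, here is a minimal sketch of what the quoted-hint workaround changes, using a plain class as a stand-in for the pydantic model (an assumption, so that the snippet runs without pydantic). A quoted hint is stored as a string in `__annotations__`, whereas an unquoted hint stores a reference to the class itself, so anything that serializes the function together with its annotations also has to serialize `Foo`. Whether this is exactly what trips up the pickler here is a guess.

```python
class Foo:  # plain stand-in for the pydantic model (assumption)
    value: int


def ok(x: "Foo") -> "Foo":
    return x


def crash(x: Foo) -> Foo:
    return x


# Quoted hints are kept as plain strings.
ok_hints = ok.__annotations__
# Unquoted hints hold a reference to the class object itself.
crash_hints = crash.__annotations__

print(ok_hints)
print(crash_hints["x"] is Foo)
```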

EDIT:
The issue is worse. I can't construct a Foo object in a worker.
I tried set_loky_pickler("pickle"), but it doesn't help.

Minimal reproduction

import pickle
from pickle import PicklingError
from joblib import Parallel, delayed
from pydantic import BaseModel


class Foo(BaseModel):
    value: int


def ok(x: "Foo") -> "Foo":
    return x


def crash(x: Foo) -> Foo:
    return x


# These work. I don't know if it is relevant.
pickle.dumps(ok)
pickle.dumps(crash)
pickle.dumps(Foo)

# When calling the function with the type-hint as a string, it works
Parallel(n_jobs=2)(delayed(ok)(i) for i in range(10))

# When calling the function with the type-hint as a type, it crashes with PicklingError
try:
    Parallel(n_jobs=2)(delayed(crash)(i) for i in range(10)) 
except PicklingError as e:
    print(e)
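
A possible explanation for why the plain `pickle.dumps` calls above succeed: the standard `pickle` module serializes functions and classes by reference (module name plus qualified name), so it never looks at the annotations. The sketch below shows this with a standard-library function chosen only for illustration. As far as I understand, joblib's default loky backend uses cloudpickle instead, which can serialize functions defined in `__main__` by value.

```python
import json
import pickle

# pickle stores only a reference to the function: its module and name.
payload = pickle.dumps(json.dumps)

# Unpickling looks the name up again and returns the very same object.
restored = pickle.loads(payload)

print(restored is json.dumps)
print(b"dumps" in payload)
```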

lesteve commented Dec 5, 2023

I am reasonably confident this is a cloudpickle issue, probably something similar to cloudpipe/cloudpickle#408. Although that issue was closed, I believe the problem still happens in some cases.

A workaround seems to be to define the pydantic model in a different file, see cloudpipe/cloudpickle#408 (comment).
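
A minimal sketch of that workaround, assuming a hypothetical module name `models_demo` and a plain class in place of the pydantic model so that it runs with the standard library only. The idea is that a class living in an importable module can be pickled by reference rather than by value.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Write the model into its own module instead of defining it in __main__.
# "models_demo" is a hypothetical name used only for this sketch.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "models_demo.py").write_text(
    "class Foo:\n"
    "    def __init__(self, value):\n"
    "        self.value = value\n"
)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
models_demo = importlib.import_module("models_demo")
Foo = models_demo.Foo


def process(x: Foo) -> Foo:
    return x


# The class now belongs to an importable module, not to __main__.
print(Foo.__module__)

# Pickling the class only records where to import it from.
restored = pickle.loads(pickle.dumps(Foo))
print(restored is Foo)
```

In a real project the temporary directory is unnecessary: `models_demo.py` would simply be a file next to the script, so that worker processes can import it too.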

For good measure:

  • can you edit your message to show the stack trace you are getting?
  • can you mention the pydantic version you are using? From a quick look, it seems that pydantic<2 and pydantic>=2 behave slightly differently
  • can you mention whether you are running this as a script (python test.py) or in an interactive console (e.g. IPython or a Jupyter notebook)? Again, the behaviour seems to differ between the two in my local tests
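
The details requested above could be gathered with something like the following sketch. It relies only on the standard library. The function names and the heuristics for detecting an interactive session are assumptions for illustration, not something joblib or pydantic provides.

```python
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def describe_environment():
    """Collect the versions and session type relevant to this report."""
    return {
        "python": platform.python_version(),
        "pydantic": package_version("pydantic"),
        "joblib": package_version("joblib"),
        "cloudpickle": package_version("cloudpickle"),
        # sys.ps1 only exists in interactive interpreters (heuristic).
        "interactive": hasattr(sys, "ps1"),
        # IPython and Jupyter import the IPython package (heuristic).
        "ipython_loaded": "IPython" in sys.modules,
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```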

lesteve commented Dec 5, 2023

Looking a bit more, it seems that when running inside the Python console, the issue has been fixed in pydantic 2.5 (pydantic/pydantic#6763, pydantic/pydantic#7876), but there is still an issue when running similar code inside IPython or a Jupyter interface (pydantic/pydantic#8232).

lesteve commented Dec 5, 2023

I am going to close this one because I think the issue is not in joblib but in cloudpickle or pydantic.

lesteve closed this as completed Dec 5, 2023