Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dill unpickles into a new type for classes defined inside another class #600

Open
NiklasRosenstein opened this issue Jun 19, 2023 · 1 comment

Comments

@NiklasRosenstein
Copy link

Example:

import dill

class A:
    class B:
        pass

    def __init__(self, b) -> None:
        self.b = b

a1 = A(A.B())
a2 = dill.loads(dill.dumps(a1))

assert isinstance(a2, A)
assert isinstance(a2.b, A.B)

Expected behaviour:

Both assertions pass.

Actual behaviour:

The second assertion fails, indicating that the deserialized type of a2.b is not an actual instance of A.B.

NiklasRosenstein added a commit to kraken-build/kraken that referenced this issue Jun 19, 2023
…bal scope, as otherwise we run into an issue with Dill deserialization where the type of `Address.elements` items is not actually the `Address.Element` type that we can access at runtime. (See uqfoundation/dill#600).
@mmckerns
Copy link
Member

It's a matter of whether the class is pointed to by reference, or if it uses the serialized class definition:

Python 3.8.17 (default, Jun 10 2023, 20:56:04) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> 
>>> class A:
...     class B:
...         pass
...     def __init__(self, b) -> None:
...         self.b = b
... 
>>> a1 = A(A.B())
>>> a2 = dill.loads(dill.dumps(a1), ignore=True)
>>> isinstance(a2, A)
False
>>> isinstance(a2.b, A.B)
False
>>> 
>>> a2 = dill.loads(dill.dumps(a1), ignore=False)
>>> isinstance(a2, A)
True
>>> isinstance(a2.b, A.B)
False
>>>
>>> a2 = dill.loads(dill.dumps(a1, byref=True), ignore=False)
>>> isinstance(a2, A)
True
>>> isinstance(a2.b, A.B)
True

So, the first two store the class definition for A, but the first ignores the stored "A" when an "A" exists in the given namespace. The last one doesn't store "A", so is forced to point to "A" by reference. It doesn't perform the same for the nested class "B", because there's no "B" defined in __main__. The behavior of "A" was done intentionally, so there were options to work around a changed class definition for "A", if needed -- the default being to assume the "A" you want is the "A" that was stored upon dump. I don't think we extended the same rule of "check in __main__ depending on ignore" to classes defined within classes. I assume you'd like the ignore keyword to be extended to this case...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants