Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

structuring with include_subclasses does not work when the union type is a direct field of a dataclass. #430

Open
aha79 opened this issue Sep 22, 2023 · 4 comments
Labels
Milestone

Comments

@aha79
Copy link

aha79 commented Sep 22, 2023

  • cattrs version: 23.1.2
  • Python version: 3.10.7
  • Operating System: Windows

Description

structuring with include_subclasses does not work properly when the union type is a direct field of a dataclass. When it is indirectly used in the type hint of a field it works.

What I Did

@dataclass
class A:
    x: int

@dataclass
class Derived(A):
    d: A

def ustrat(a: Any, conv) -> Any:
    return cattrs.strategies.configure_tagged_union(a, conv, tag_name="type")

converter = cattrs.Converter()
cattrs.strategies.include_subclasses(A, converter, union_strategy=ustrat)

dic = [{'x': 9, 'd': {'x': 99, 'd': {'x': 999, 'type': 'A'}, 'type': 'Derived'}, 'type': 'Derived'}]

data_out = converter.structure(dic, List[A] )

# >> data_out  = [Derived(x=9, d=A(x=99))]       # WRONG we expect  '[Derived(x=9, d=Derived(x=99, d=A(x=999)))]'
# Note: The reverse (i.e. unstructuring) works as expected.

Now if we (incorrectly) change A to Optional[A] in Derived it does indeed work. Using Optional[A] instead of A does not harm serialization, as None does never occur. It interferes with static type checking though so it is not a good workaround.

@dataclass
class Derived(A):
    d: Optional[A]      # correct would be 'A', this is a work around

data_out = converter.structure(dic, List[A] )

# >> data_out  = [Derived(x=9, d=Derived(x=99, d=A(x=999)))]       # RIGHT (but at the cost of wrong  Derived.d)
@Tinche
Copy link
Member

Tinche commented Sep 22, 2023

Interesting. So what I think is happening is the hook for Derived gets created before the include_subclasses hook for A gets installed, hence your behavior.

It's somewhat of a chicken and egg problem. The include_subclasses hook for A needs the hook for Derived to exist, and the hook for Derived needs the include_subclasses hook for A to exist.

I wonder if we can be clever and override hooks for children if their type is the parent type. So the hook for Derived.d would use late binding, i.e. be converter.structure(obj, A). (Note that cattrs doesn't do this by default for performance.)

@Tinche Tinche added the bug label Sep 22, 2023
@anderssonjohan
Copy link

@Tinche I'm not sure if this is the same problem, but it seems to be an issue with the tagged union. It works fine if I don't use that.

Example:

>>> import functools
>>> import attrs
>>> import cattrs
>>> import cattrs.strategies
>>> @attrs.define()
... class Base:
...     pass
...
>>> @attrs.define()
... class Child(Base):
...     field: Base
...
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c)
>>> result = c.structure({"field": {"field": {}}}, Base)
>>> result
Child(field=Child(field=Base()))
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c, union_strategy=functools.partial(cattrs.strategies.configure_tagged_union,tag_name="exp_type"))
>>> c.unstructure(Child(field=Child(field=Base())))
{'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Base()) # Child is lost!

I started using cattrs and suddenly one class didn't work.
It seems the "workaround" making fields optional makes it work, but is not a good solution in the long run.

What I had which worked:

@attrs.define
class Base:
    pass

@attrs.define
class List(Base):
    items: List[Base]

@attrs.define
class Child(Base):
    left: Base
    right: Base

List with Child items works fine, but in Child I have to change type of left and right to Optional[Base] in order to have them structured as subclasses to Base.

An example with the List-class:

>>> c.unstructure(Child(left=List(items=[Base(), Child(left=List(items=[Base()]), right=Base())]),right=Base()))
{'left': {'items': [{'exp_type': 'Base'}, {'left': {'items': [{'exp_type': 'Base'}], 'exp_type': 'List'}, 'right': {'exp_type': 'Base'}, 'exp_type': 'Child'}], 'exp_type': 'List'}, 'right': {'exp_type': 'Base'}, 'exp_type': 'Child'}
>>> d = c.unstructure(Child(left=List(items=[Base(), Child(left=List(items=[Base()]), right=Base())]),right=Base()))
>>> c.structure(d, Base)
Child(left=Base(), right=Base())

Subclasses without tagged union doesn't work for List because it complains about not having any unique fields, which I find strange as well since it clearly has the unique field items: List[Base], but that's another problem.

@anderssonjohan
Copy link

@aha79 @Tinche Here's my ugly workaround which let's me skip the workaround using Optional using the example in my previous comment.

>>> import functools
>>> from typing import Dict, Type
>>> import attrs
>>> import cattrs
>>> import cattrs.gen
>>> import cattrs.strategies
>>> @attrs.define()
... class Base:
...     tag_to_cls: Dict[str, Type] = {}
...     def __init_subclass__(cls, **kwargs):
...         super().__init_subclass__(**kwargs)
...         cls.tag_to_cls.update({cls.__name__: cls})
...
>>> @attrs.define()
... class Child(Base):
...     field: Base
...
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c, union_strategy=functools.partial(cattrs.strategies.configure_tagged_union,tag_name="exp_type"))
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Base())  # same as in my previous comment, child is lost
>>> def structure_hook(c, o, cl):  # cl = Base
...     cl = Base.tag_to_cls.get(o["exp_type"], cl)  # default to Base
...     struct = cattrs.gen.make_dict_structure_fn(cl, c)
...     return struct(o, cl)
>>> c.register_structure_hook(Base, functools.partial(structure_hook, c))
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Child(field=Base()))

@Tinche
Copy link
Member

Tinche commented Oct 17, 2023

@anderssonjohan I spent some time debugging your example (phew, these are kind of hard to debug tbh).

I think it's the same issue. The reason it works if you change the field type to an Optional is because that forces late resolution for the Child.field hook, which is what solves the underlying issue. Definitely need to figure this out for the release after next (the next release is very close and very overdue, so I don't want to bloat it).

@Tinche Tinche added this to the 24.1 milestone Nov 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants