Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ForwardRef #215

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

aha79
Copy link

@aha79 aha79 commented Jan 27, 2022

This adds ForwardRef support to cattr (See #201).

  • ForwardRefs in classes

    @dataclass
    class Test:
        x: List["XType"]
    
    XType = int
    
  • Recursive classes

    @dataclass
    class Node:
        data: Dict
        left: Optional["Node"]
        right: Optional["Node"]
    
  • Recursive type aliases. (Note: Even with PEP 563 we need ForwardRefs here)

    # Tree-like structure
    Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ]
    
    class UserClass:
        some_tree_data: Node
    
  • Recursive type aliases, across modules. (Needs explicit ForwardRef resolution, though. This is a Python limitation)

    # module.py
    Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ]
    cattr.resolve_types(Node, globals(), locals())
    
    # main.py
    from module import Node
    class UserClass:
        some_tree_data: Node
    
  • Manual registration (this worked before, but is still supported.) This PR also fixes TypeError when registering ForwardRefs in python 3.10.2 #206)

    Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ]
    
    c.register_structure_hook(
        ForwardRef("Node"), lambda obj, _: c.structure(obj, Node)
    )
    c.register_unstructure_hook(
        ForwardRef("Node"), lambda obj: c.unstructure(obj, Node)
    )
    

Specifically, the PR does

  • Add hooks for ForwardRef (this equires ForwardRefs to be in resolved state, otherwise an error is generated)

  • Ensure that ForwardRef are resolved for registered types, whenever it is possible to resolve them without context information. (Essentially this is done by a call get_type_hints on the type containing the ForwardRef).

    For this a new function resolve_types was added which is systematically used. It basically does the same thing as attrs.resolve_types but works for all types (attrs classes, dataclasses and type aliases).

  • ForwardRefs are now registered with direct dispatch and no longer singledispatch. This is to work around TypeError when registering ForwardRefs in python 3.10.2 #206.

These test all kinds of subtleties of ForwardRefs and their
(automatic, semiautomatic or manual) resolution. As dataclasses, attrs
classes and type aliases behave slightly different and, in addition to that,
behave differently with respect to PEP 563 (the delayed evaluation of type hints)
the tests are quite extensive. A critical test-case is also whether
resolution works across modules, i.e. whether ForwardRefs are handled correctly
when they are imported from another module.
Correct operation in any of the above cases cannot be taken for granted
@Tinche
Copy link
Member

Tinche commented Jan 27, 2022

Thanks for putting in work on this.

The hooks are obviously fine. The issue is that this introduces resolve_types for dataclasses, and tbh that's out of scope for cattrs. That should be implemented in Python itself or another library that's willing to take on the maintenance burden of it.

@aha79
Copy link
Author

aha79 commented Jan 31, 2022

Hmm, but what is the alternative? Not supporting dataclasses would not be very useful - at least not for me.
Hooks without any type resolving before will just generate an error message.

In fact, you already do resolve dataclasses in cattr. It is the get_type_hints call inside adapted_fields. Even though this is enough to handle PEP563, one needs more symmetric handling of attrs and dataclasses to get all ForwardRef tests to pass. That is why I introduced resolve_types, in the same spirit as has and fieldsin _compat.py.

One could simplify the new resolve_types by removing the part under if not is_py39_plus:, though. This would not really hurt as it only affects older Python versions (and only if PEP 563 is enabled).

@aha79
Copy link
Author

aha79 commented Jan 31, 2022

I understand your concern regarding mantainability. However I would argue having a more symmetric way of handling dataclasses and attrs classes is cleaner and needs less maintenance effort.

For instance: One can get rid of get_type_hints call in adapted_fields (which probably should not really be in there) and the calls to attrs.resolve_types scattered around cattrs (and some more clean ups).
But that would be a refactoring on its own and I wanted to keep the list of changes for this PR small.

@aha79
Copy link
Author

aha79 commented Jan 31, 2022

Could it also be that in a428272 cattr/gen.py and cattrs/gen.py are accentially swapped?

@Tinche
Copy link
Member

Tinche commented Jan 31, 2022

You make a couple of good points, so let's discuss further.

(The cattr -> cattrs move isn't accidental, it's on purpose. I think all new code will be going into the cattrs namespace, and existing code will be migrating there over time. The old packages will still have aliases so no backwards compatibility break.)

The resolve_types as you've implemented it overwrites the type field for dataclasses. I have no idea if this is a supported use case, and that's why cattrs is careful to never do it. Hence the adapted fields approach. If in the next version Python decides to make this impossible by making this field immutable or something, I don't want this to be a cattrs problem.

On a higher level, here's the philosophy of cattrs: attrs is simply a better library than dataclasses, and one of its advantages is that the maintainers of attrs (I'm one of them) choose to implement and maintain resolve_types. The maintainers of dataclasses choose not to - instead the interface provided is get_type_hints, and the users get to deal with that. The answer here isn't cattrs patching up this missing support in dataclasses, it's the users of dataclasses (you) switching to attrs or putting pressure on the maintainers of dataclasses to implement this.

Now that you see where I'm coming from, you say you're forced to use dataclasses and I'm sympathetic to this use case. So let's discuss to see what we can accomplish without cattrs implementing resolve_types for dataclasses in this form.

One of the options is you implementing resolve_types for dataclasses in your code base. So you get to maintain it instead ;) Would this work for you? Another option would be improving some of the cattrs APIs to be more customizable, which would be useful for other things too.

Also, didn't you mention ForwardRefs can resolve themselves in newer Python versions? Presumably via get_typing_hints?

@aha79
Copy link
Author

aha79 commented Feb 10, 2022

The primary reason to implement resolve_types for all types is that it is needed inside cattr for ForwardRef resolution and not to provide it to the user (that would be just a side effect).

Perhaps let me first expand on what functionality is needed:

  • You are right that Python (only) provides the get_type_hints function for type resolution. In fact this is all we need. The thing that makes it complicated is that we cannot just call get_type_hints on the ForwardRef and expect it to work.

  • Python/get_type_hints, under the hood, does some complicated things to find out the target module of the ForwardRef. Some types know their module and get_type_hints will resolve the ForwardRefs in child annotations.
    For instance get_type_hints called on a dataclass with a field of type Optional['X'] will resolve the 'X' in the same module scope as the dataclass was defined, no matter where the call to get_type_hints is made.

  • Types that know their own module are: ForwardRefs with module field set (only used in special cases), classes (including attrs classes & dataclasses), NewType, TypedDict, NamedTuple, and perhaps more I overlooked [1].

  • So to resolve ForwardRefs we need to call get_type_hints on some parent type of the ForwardRef. When this parent type knows its module get_type_hints will resolve the child ForwardRefs.

  • Concluding, since it is beyond the scope of a library such as cattr to know which typing.py types carry module scope and which not, the simplest solution is to just call get_type_hints on every type that is supposed to get (un)structured. That makes sure that all child-ForwardRefs get resolved when there is opportunity for them to get resolved.

The original idea was to concentrate all get_type_hints business logic calls into one function which is called in the hook factory and takes care of resolving ForwardRefs and (the closely related) PEP563 delayed annotations.

So now let me try to address your specific concerns: I see several options:

  1. Remove overwriting of type fields of dataclasses in resolve_types. You make a valid point of not overwriting the type field of dataclass Field. This is not good. Removing it will just introduce failing tests with Converter. But GenConverter is still fine. If ForwardRefs would only be supported with GenConverter this is ok for me.
  2. Make resolve_types return a value. After get_type_hints do not write to type field, but convert dataclasses to attrs classes. This is essentially the functionality that is done in adapted-fields, and not only return a list of Attribute but put these in a attrs class and return that. (This and corresponding clean-ups could also be done after 1 and we could would not strictly need that)
  3. The resolve_types does not need to be exported from cattr.
  4. resolve_types could receive a better name

So if we do 1 & 3 and possibly 4, the resolve_types would have the dataclasses handling removed and not be exported. It would not generate an error when called for datacasses and work for type aliases as well.

[1]: Even though these types are not (yet) supported by cattrs, the user might still register a hook for them and expect child ForwardRefs to work.

@aha79
Copy link
Author

aha79 commented Feb 10, 2022

Another option would be improving some of the cattrs APIs to be more customizable, which would be useful for other things too.

Could you elaborate on that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TypeError when registering ForwardRefs in python 3.10.2
3 participants