Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protocol(s) for attrs classes? #952

Closed
pganssle opened this issue Apr 25, 2022 · 13 comments
Closed

Protocol(s) for attrs classes? #952

pganssle opened this issue Apr 25, 2022 · 13 comments

Comments

@pganssle
Copy link
Member

During a conversation today, it came up that some people might find it useful to have a static type that is equivalent to attrs.has. I noticed, for example, that the signature for attrs.asdict is:

def asdict(
    inst: Any,
    ...
)

But if, for example, we had a protocol like this:

class AttrsClass(typing.Protocol):
    __attrs_attrs__ : Any

It would make it so that many of the attrs-class-specific functions could be statically typed. End-users could create this protocol themselves, but it seems like this might be relying on implementation details, and the attrs functions aren't annotated to require classes match this protocol anyway, so this seems like something that maybe should be exposed directly in attrs.

I'm not sure if there are any finer-grained protocols that it would be worth defining — e.g. are there any attrs classes where asdict or astuple or evolve would fail on? If so, it might make sense to define AsDictable, AsTupleAble, MutableAttrsClass or whatever.

@rickeylev
Copy link

To add a bit of context, this stemmed from code that basically did:

def convert(obj: Any): 
  if <is dict>: <handle dict case>
  elif <is list>: <handle list case>
  ...
  elif <is attr defined class>:
    attrs.asdict(obj)
  else: <error, unsupported object>

While (most of) the other cases can be represented in the type signature (Union[list, dict]), the attr case basically forces using object or Any as the type, effectively preventing static analysis. I think with the typeguard pep, static typing would be possible if attrs provided a "type" that the object was understood as.

@Tinche
Copy link
Member

Tinche commented Apr 25, 2022

I started some work on this already: #890

My plan was to go even further: since Mypy now knows the exact type of __attrs_attrs__ we could also have correct type checking for attrs.fields. Through a contribution to Mypy, I made the __attrs_attrs__ attribute be recognized as a class attribute (which it is) without being aware Mypy doesn't support generic class attributes in protocols (python/mypy#5144). So I essentially screwed up my own plans with my Mypy work, which made me take an extended break from this work ;)

But I do think this would be a good idea, naturally.

@hynek
Copy link
Member

hynek commented Apr 26, 2022

The problem is that __attrs_attrs__ is strictly a private API.

As much as I love them, I don't think that a Protocol is the right fit here, specifically because attrs doesn't provide a public API on the classes on the utmost purpose.


I think what would make more sense here is something like:

class AttrsClass(ABC):
    pass

and then call AttrsClass.register(cls) in ClassBuilder.

Opinions?

@Tinche
Copy link
Member

Tinche commented Apr 26, 2022

Even if you consider __attrs_attrs__ to be a private API (I kind of don't at this point), there's no reason a protocol (defined here) can't make use of that API. The implementation using it is also clean - it's the thing that makes a class an attrs class so a protocol for it makes perfect sense to me.

@hynek Your proposed solution doesn't work for static checking, right?

@hynek
Copy link
Member

hynek commented Apr 26, 2022

I don’t think there’s a solution that does it right statically without writing plugins? Given that I’m afraid that pyright is the future thanks to funding, I’m not sure any other is viable. Unless I misunderstood the problem — I thought they just wanted to be able to determine at runtime via typing that something is an attrs class?

@Tinche
Copy link
Member

Tinche commented Apr 26, 2022

The plugin is already written :p. The protocol from the OP should actually work in Mypy now.

@rickeylev
Copy link

I thought they just wanted to be able to determine at runtime via typing that something is an attrs class?

Not quite. The desire is for static analysis to be able to know a type is an attrs class and address them.

I don’t think there’s a solution that does it right statically without writing plugins?

This is probably mostly true (iiuc, mypy and pytype both have custom code for attrs). But such type-checker specific extensions just allow a specific type checker to track, via it's own implementation-specific mechanisms, that an object is of type "Attrs generated class".

The key part missing is a canonical static-analysis-time symbol that says "this is an attrs generated class", and it'll work best if that symbol name is provided/owned by attrs itself instead of some third party.

Such a symbol allows arbitrary code to say "I accept an attrs generated class" and not have to worry about a specific type checker's way to represent them. Without such a symbol, user code would have to do e.g. Union[pytype_extensions.AttrGenerated, mypy_extensions.AttrGenerated, etc] or some-such.

The problem is that __attrs_attrs__ is strictly a private API.

TBH, I wouldn't worry about that. Type annotations aren't really a great way to enforce visibility controls. Users can just cast, disable type checker warnings, and etc to bypass the "wrong" errors when the type checker says an attribute doesn't exist when it actually does.

An alternative would be to introduce an attribute whose only purpose is to act as a marker. e.g. __attrs_is_attr_generated__: object, and put that on the protocol.

re: class AttrsClass; register(...):

Personally, I don't really like protocols so this more nominal typing approach is more appealing to me. I don't think the register() part is strictly necessary. Just note that it would allow a second way to test for an attrs class in addition to attrs.has(); not sure if that's desirable to you as the api owner. Omitting register() also, effectively, keeps it as a static-analysis-time behavior, too.

TBH, my gut says keeping the static-analysis-support simple like this is prudent -- the type language is capable of expressing the basics of "the class/object looks-like/inherits-from X", but isn't yet capable of expressing the set of transformations attrs does to a class, so trying to fit anything more than "yep, its an attrs generated thing" would be hard and not worth it.

A simple empty marker class like this would probably provide the basics for what's necessary, though. Type checkers can internally tag that @attrs.s class Foo means "Foo and AttrsClass" are compatible.

@rchen152
Copy link

The issue I see with an empty marker class is that every type checker would still have to special-case it, and if we're special-casing attrs classes anyway, why not just support attrs.has()?

I like the idea of using a protocol. If the concern is about __attrs_attrs__ being private, then the attrs library itself could provide the protocol class, and users would never need to know how the protocol is defined under the hood.

@pganssle
Copy link
Member Author

If the concern is about __attrs_attrs__ being private, then the attrs library itself could provide the protocol class, and users would never need to know how the protocol is defined under the hood.

To be clear, this is indeed what I was intending to propose: attrs would provide the protocol with no guarantees about the implementation. The contract would be "attrs.AttrsClass matches any attrs-generated class", not "you can implement an attrs-like class that matches this protocol" or anything else that exposes the implementation details of the protocol. (Though obviously Hyrum's law, users gonna user, etc)

@rickeylev
Copy link

The issue I see with an empty marker class is that every type checker would still have to special-case it, and if we're special-casing attrs classes anyway, why not just support attrs.has()

Having a type checker recognize the type-implications that attrs.has() performs is somewhat a different issue. The goal is make code like this type-safe insofar as static analysis is concerned:

@attrs.s
class Foo: ...

def func(x: Union[dict, <???>]):
  if isinstance(x, dict): ...
  elif <x is an attrs defined class>:
    attrs.asdict(x)
  else: <error>

func(Foo())

A type checker might internally know Foo is attrs-generated, but users have no way of saying that themselves. Because they can't say that, they are forced to, essentially, disable type checking -- they have to resort to putting Any or object, which defeats any more specific types they put in.

A type checker supporting attrs.has() would make the "attrs.asdict()" branch Just Work for users, but it's not a necessity. Users could still manually cast when they know.

To be clear, this is indeed what I was intending to propose: attrs would provide the protocol with no guarantees about the implementation. The contract would be "attrs.AttrsClass matches any attrs-generated class", not "you can implement an attrs-like class that matches this protocol" or anything else that exposes the implementation details of the protocol. (Though obviously Hyrum's law, users gonna user, etc)

I don't see how this is possible using Protocol? Protocol is for structural typing, so there has to be something about the object structure that causes an object to match. Is an empty Protocol special?

@pganssle
Copy link
Member Author

I don't see how this is possible using Protocol? Protocol is for structural typing, so there has to be something about the object structure that causes an object to match. Is an empty Protocol special?

attrs.has() is implemented by checking for the existence of __attrs_attrs__, so the protocol I suggested in the initial post should work now.

I'm suggesting that attrs provide such a protocol to encapsulate this private implementation detail, the same way that attrs.has is a supported public interface for runtime checking.

The main constraint this introduces on attrs going forward is that it means attrs classes would need to be implemented in such a way that they have some features (public or private) that can be expressed as a Protocol.

@hynek
Copy link
Member

hynek commented Apr 30, 2022

Alright, Paul & I spoke at PyCon and I'll try to shepherd #890 into main. Currently some higher CI forces are fighting us.

@hynek
Copy link
Member

hynek commented May 1, 2022

Fixed by #890

@hynek hynek closed this as completed May 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants