Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Protocols #28

Closed
ruancomelli opened this issue Aug 24, 2021 · 16 comments · Fixed by #73
Closed

Support for Protocols #28

ruancomelli opened this issue Aug 24, 2021 · 16 comments · Fixed by #73

Comments

@ruancomelli
Copy link
Contributor

ruancomelli commented Aug 24, 2021

Protocols were introduced in Python 3.8 by PEP 544, and it would be awesome if Plum could handle them.

A simple use case would look like this:

from typing import Any, Protocol
from plum import dispatch

class Named(Protocol):
    name: str

class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name: str = name
        self.age: int = age

@dispatch
def greet(anyone: Any) -> None:
    print('Hi!')

@greet.dispatch
def _greet_named(named: Named) -> None:
    print(f'Hello, {named.name}')

me = Person('Ruan', 24)
greet(me)
# should print "Hello, Ruan"
# currently throws the following error:
# TypeError: Instance and class checks can only be used with @runtime_checkable protocols

Another use case is what I am actually interested in: using dataclasses. Dataclasses can be identified by a special attribute: __dataclass_fields__ (see python/mypy#8578), so that I could write

class Dataclass(Protocol):
    __dataclass_fields__: Dict[str, Any]

@dispatch
def do_something(dataclass_instance: Dataclass):
    ...

I have no idea how to proceed to implement this, but I'm interested in seeing this in Plum and would love to help in any way I can.
Just for the record, Dry Python Classes has this feature, we could take some inspiration from them.

@ruancomelli
Copy link
Contributor Author

I forgot to mention, but there is also a decorator typing.runtime_checkable. It does not work in this context though:

from typing import runtime_checkable

@runtime_checkable
class Named(Protocol):
    name: str

This snippet causes greet(me) to fail with a TypeError: Protocols with non-method members don't support issubclass().

@PhilipVinc

This comment has been minimized.

@PhilipVinc
Copy link
Collaborator

EDIT: my comment above is wrong.

There are a few problems with supporting Protocols :

  • They seem to be static-only unless they are explicitly marked as @runtime_checkable.
  • @runtime_checkable does not support is subclass, which is what is being used to check types internally.

A possibility would be to partially re-implement Protocols internally in Plum (pretty much like a few abstract types are implemented now) but that seems like an expensive solution...

@wesselb
Copy link
Member

wesselb commented Aug 25, 2021

Ah, support for Protocols would be super cool...

A possibility would be to partially re-implement Protocols internally in Plum (pretty much like a few abstract types are implemented now) but that seems like an expensive solution...

This is also what I'm thinking would be the way forward, which is not impossible but potentially a quite sizeable chunk of work. If you want to have a go at it (and please feel free to!), the implementation can perhaps be similar to those for typing.Sequence and typing.Iterable, which avoids the use of type_of at runtime.

What perhaps should happen first is a slight rework of plum/parametric.py, because that file and the associated tests/test_parametric.py currently contains some suboptimals patterns and consequently a lot of (nearly) duplicate code.

I might not have the opportunity to sort that out and look at support for Protocols in the next two weeks, but perhaps I can look at it after that.

@PhilipVinc
Copy link
Collaborator

@wesselb I think one major source of complication is that all internal plum types have a meta-class TypeMeta that is (As far as I understand) only implementing only the [..] getitem logic).
I think. we could partially simplify the logic and maybe reduce code duplication by getting rid of this (all?) metaclasses.

For starters, we could get rid of TypeMeta by replacing it with a common base class (not metaclass) which implements __class_getitem__ which should have the same result.

@wesselb
Copy link
Member

wesselb commented Aug 25, 2021

@PhilipVinc I was not aware of __class_getitem__. Your proposal sounds like a good change.

One other aspect which is not entirely satisfactory is that Plum types are objects rather than types. For example, typing.Union[plum.Union[int]] doesn't work: TypeError: Union[arg, ...]: each arg must be a type. Got builtins.int., and def f(x: type) does not catch Plum types. One idea I was thinking about was to implement all Plum types with @parametric, so that all Plum types are actually types rather than objects. One issue with this change is that some Plum types load lazily (e.g. forward references; or when referencing types which are in a not-yet-imported packags, which resolve once the package is actually imported), which means that the hashing of type parameters can change, so you have to be careful with caching the @parametric subclasses with a dictionary.

@PhilipVinc
Copy link
Collaborator

One other aspect which is not entirely satisfactory is that Plum types are objects rather than types.

Yeah, It's very confusing even for me...

We should brainstorm a bit about possible solutions to this.
For example, why does plum need to wrap in a Type every type?
can't it work with standard types internally? this would simplify things a lot...

some Plum types load lazily

Is this feature really needed?
I know you use it in your multi-backend backage but...
why is this needed and why can't it be solved with some base abstract types?
Is it because this allows you to run your code before, say numpy is loaded and already reference numpy.ndarray? Couldn't this be solved by importing in the right order/something a la Requires.jl instead of having this quite-complicated code?

@wesselb
Copy link
Member

wesselb commented Aug 29, 2021

For example, why does plum need to wrap in a Type every type?
can't it work with standard types internally? this would simplify things a lot...

I fully agree that it would be much nicer to just work with normal types and avoid wrapping everything.

Is it because this allows you to run your code before, say numpy is loaded and already reference numpy.ndarray?

Yeah, that's right. For example, I might want to write a package which provides different implementations for JAX, TensorFlow, and PyTorch:

import lab as B
from plum import dispatch

@dispatch
def my_fun(x: B.JAXNumeric):
    # Do something


@dispatch
def my_fun(x: B.TFNumeric):
    # Do something slight different


@dispatch
def my_fun(x: B.TorchNumeric):
    # Do something in another way

These functions would depend on the tensor types of JAX, TensorFlow, and PyTorch. However, to actually depend on those tensor types (in the sense of having a reference to them), you would have to import all three packages (and hence have them installed), which isn't very lightweight. Therefore, it would be nice if you could write the above code where B.JAXNumeric et cetera automatically resolves to a JAX tensor type once jax is imported.

I agree that this is a pretty niche use case, though.

A more common and unfortunately more complicated use case is that of forward references:

from plum import dispatch

class A:
    @dispatch
    def do(self, a: "A"):
        pass

Here A.do gets the string "A" which is supposed to refer to A, but this referencs can only be resolved once the class A exists, which is not yet the case when A.do is defined. The way this is currently implemented is with a lazily loading type.

Nevertheless, I think these two features can possibly be implemented with just parametric types, which sounds much more appealing.

@ruancomelli
Copy link
Contributor Author

Thank you both for your replies!

I would love to contribute in any way I can, I'm just really busy until next week. Nevertheless, I am interested in helping here.


For example, why does plum need to wrap in a Type every type?
can't it work with standard types internally? this would simplify things a lot...

I fully agree that re-implementing every typing type doesn't seem ideal.

For starters, we could get rid of TypeMeta by replacing it with a common base class (not metaclass) which implements class_getitem which should have the same result.

This looks like a good place to start.

Nevertheless, I think these two features can possibly be implemented with just parametric types, which sounds much more appealing.

@wesselb what would a solution like this look like? Do you mean something like PlumUnion = parametric(typing.Union)?


One thing that is puzzling me here is that, just for fun, I erased the function plum.function.append_default_args, and, to my surprise, all tests passed! Is this expected? I ask this because, if I'm really going to toy around with the code, then breaking changes should break the tests. The fact that all tests passed means that this function could be removed from the source code, is this correct? If not, could you please take a look at the tests and make sure they cover all intended usage cases?


Anyway, if you agree that perhaps some refactoring in the base code is advisable before supporting protocols, I am more than happy to help in this. Do you have any advice regarding where to start?

@wesselb
Copy link
Member

wesselb commented Sep 9, 2021

I would love to contribute in any way I can, I'm just really busy until next week.

That's no problem at all. I'm also fairly busy with study things, so I'm also only working on this intermittently. :) You offering your help is already very much appreciated.

I fully agree that re-implementing every typing type doesn't seem ideal.

Although this isn't ideal, we might have to reimplement typing types which don't implement the required behaviour. E.g., issubclass(typing.List[int], typing.List[numbers.Number]) does not work and is something we will have to implement.

The current ugliness is that, in addition to the reimplementation of some typing types, all types are also wrapped in Plum's type.Type, which, as @PhilipVinc points out and as I believe, is not necessary at all. I think the wrapped can be removed, but, before that, we should make sure that type.AbstractType and its subclasses don't provide any additional behaviour that the rest of the code relies upon.

@wesselb what would a solution like this look like? Do you mean something like PlumUnion = parametric(typing.Union)?

Suppose that we would really have to reimplement typing.Union. Internally, I would love to do this using @parametric:

from plum import parametric

@parametric
class MyUnion:

    @classmethod
    def __infer_type_parameter__(cls, *types):
        return types

Whereas types with arguments (like a union with types) are currently represented as objects with those arguments stored as attributed, in the above pattern types would be represented as actual types (issubclass(type(MyUnion[int, list]), type) == True) with arguments stored in the type parameter. This is desirable, because, e.g., typing types only take types as arguments, not objects, which means that, currently, Plum types cannot be fed to typing types.

The main reason why we cannot just implement all types in the above way is that, in the above pattern, the arguments to a type must be hashable at type construction time. Crucially, more complicated types such as forward references to classes or types which load once a package is imported may not yet be hashable when they are fed as arguments. One way around this is to assign a temporary, unique hash value to types which are not yet hashable and (1) change the hash when the type becomes hashable and (2) reset all caches which rely on that hash. I think this will work.

One thing that is puzzling me here is that, just for fun, I erased the function plum.function.append_default_args, and, to my surprise, all tests passed!

Could it be that function.py was compiled with Cython as still available as plum/function.cpython-3Xm-darwin.so? If that is the case, then the .so will be used rather than the .py file. Removing that function should already break Plum when the package is imported.

Anyway, if you agree that perhaps some refactoring in the base code is advisable before supporting protocols, I am more than happy to help in this. Do you have any advice regarding where to start?

I'm slightly leaning towards that, before implementing support for Protocol, the wrapping of types should be removed to ensure that all types are really types, and I'm leaning towards the above solution with @parametric. What are your thoughts about this approach? If we do decide to move forward with this change, then, I think, the changes might be substantial but shouldn't pose big troubles (hopefully). It might be hard to execute these changes without intricate familiarity with the codebase, however. I'm thinking that it might be easiest for me to reserve a weekend day to just get this done and present it as a PR to you guys. Unfortunately, at the moment, I happen to be in time-constrained period due to PhD work.

@PhilipVinc
Copy link
Collaborator

By the way, I'm not sure the Union presented above makes much sense.
Infer type parameters is for when you are trying to contrucy an instance of the type, but here you are trying to construct the type itself...

Maybe this is something we should keep in mind.
Union/UnionAll are types that cannot be instantiated

@wesselb
Copy link
Member

wesselb commented Sep 10, 2021

By the way, I'm not sure the Union presented above makes much sense.

Hah, yeah, you're totally right. Doesn't make much sense indeed. I guess I intended to use it as a sort of constructor for the type, which we currrently don't have. If a constructor would be necessary, maybe it isn't, one option would be to add another magic method:

from plum import parametric

@parametric
class MyUnion:

    @classmethod
    def __type_parameter__(cls, *types):
        # Validate, parse, or process `types` here...
        return types

I guess it could be useful in general to have a constructor which can validate the type parameters that you pass to a type. E.g., this would be possible:

from plum import parametric, dispatch

@parametric
class NTuple:

    @dispatch
    @classmethod
    def __type_parameter__(cls, el_type: type, number: int):
        return (el_type, number)

    @dispatch
    @classmethod
    def __type_parameter__(cls, *types: type):
        return cls.__type_parameter__(Union[types], len(types))

To prevent types from being instantiated, one way is to raise an exception in the constructor. Perhaps something like the below could be what we might want?

from plum import parametric, dispatch

@parametric
class MyUnion:
    def __init__(self):
        raise TypeError("Unions cannot be instantiated.")

    @dispatch
    @classmethod
    def __type_parameter__(cls, *types: type):
        return set(types)

@ruancomelli
Copy link
Contributor Author

Although this isn't ideal, we might have to reimplement typing types which don't implement the required behaviour. E.g., issubclass(typing.List[int], typing.List[numbers.Number]) does not work and is something we will have to implement.

Instead of implementing custom types so that issubclass(plum.List[int], plum.List[numbers.Number]) just works, what about replacing issubclass wherever necessary with a custom function?

I believe that one problem here is that a subclass is a class that derives from another one. Thus, in my opinion, it is conceptually wrong to expect int to be a subclass of Union[int, str] – as in issubclass(int, Union[int, str]) – because int does not inherit from Union[int, str]. So perhaps we are looking for a function other than issubclass.

There are some libs out there (like typing_utils) that implement functions for testing with typing types:

import typing
from typing_utils import issubtype

assert issubtype(typing.List[typing.List], typing.List[typing.Sequence]) == True

I don't know how well they work with forward references, but perhaps they're worth experimenting?

I'm slightly leaning towards that, before implementing support for Protocol, the wrapping of types should be removed to ensure that all types are really types, and I'm leaning towards the above solution with @Parametric. What are your thoughts about this approach?

If wrapping types is the way to go, then I find @parametric types a great solution. Your NTuple example looks gorgeous 😁

@wesselb
Copy link
Member

wesselb commented Sep 17, 2021

(...) I believe that one problem here is that a subclass is a class that derives from another one. (...)

That's a very fair point to make. My take on this is slightly different: I think of the collection as all types simply as an set with a total order implemented by issubclass, and you're free to add elements to this collection (like typing.List[int]) as long as issubclass still implements a total order. In this perspective, inheritance is just one syntax to define that order.

There are some libs out there (like typing_utils) that implement functions for testing with typing types:

Ah, that's very nice! Thanks for mentioning this—I'll definitely look into it.

If wrapping types is the way to go, then I find @Parametric types a great solution. Your NTuple example looks gorgeous 😁

Yay! At the moment, I'm fighting against some deadlines, but, after those have passed, I'll try to put together a proposal of the proposed refactoring.

@mofeing
Copy link

mofeing commented Feb 10, 2022

any news on this? any way i can help?

@wesselb
Copy link
Member

wesselb commented Feb 10, 2022

@mofeing Unfortunately on my side there has been no progress for the support of Protocol. A contribution adding support is definitely welcome! The issue which I mean to resolve first is the internal wrapping of types, but, unfortunately, I've also not had the time to work on that.

@wesselb wesselb mentioned this issue Oct 23, 2022
@wesselb wesselb mentioned this issue Feb 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants