Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add types.copy_class() which updates closures #91299

Closed
vstinner opened this issue Mar 28, 2022 · 21 comments
Closed

Add types.copy_class() which updates closures #91299

vstinner opened this issue Mar 28, 2022 · 21 comments
Labels
3.11 only security fixes stdlib Python modules in the Lib dir

Comments

@vstinner
Copy link
Member

BPO 47143
Nosy @rhettinger, @pitrou, @vstinner, @ericvsmith, @encukou, @serhiy-storchaka, @JelleZijlstra, @frenzymadness, @corona10

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2022-03-28.13:51:37.662>
labels = ['library', '3.11']
title = 'Add types.copy_class() which updates closures'
updated_at = <Date 2022-04-02.12:08:35.044>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2022-04-02.12:08:35.044>
actor = 'vstinner'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2022-03-28.13:51:37.662>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 47143
keywords = []
message_count = 20.0
messages = ['416168', '416175', '416176', '416177', '416178', '416179', '416180', '416183', '416184', '416186', '416189', '416190', '416191', '416221', '416227', '416230', '416231', '416232', '416544', '416549']
nosy_count = 9.0
nosy_names = ['rhettinger', 'pitrou', 'vstinner', 'eric.smith', 'petr.viktorin', 'serhiy.storchaka', 'JelleZijlstra', 'frenzy', 'corona10']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue47143'
versions = ['Python 3.11']

@vstinner
Copy link
Member Author

vstinner commented Mar 28, 2022

Class decorarators of attrs and stdlib dataclasses modules have to copy a class to *add* slots:

In the common case, copying a class is trivial:

cls2 = type(cls)(cls.__name__, cls.__bases__, cls.__dict__)

Full dummy example just to change a class name without touching the original class (create a copy with a different name):

class MyClass:
    def hello(self):
        print("Hello", self.__class__)

def copy_class(cls, new_name):
    cls_dict = cls.__dict__.copy()
    # hack the dict to modify the class copy
    return type(cls)(new_name, cls.__bases__, cls_dict)

MyClass2 = copy_class(MyClass, "MyClass2")
MyClass2().hello()

Output:

Hello <class '__main__.MyClass2'>

The problem is when a class uses a closure (__class__ here):

class MyClass:
    def who_am_i(self):
        cls = __class__
        print(cls)
        if cls is not self.__class__:
            raise Exception(f"closure lies: __class__={cls} {self.__class__=}")

def copy_class(cls, new_name):
    cls_dict = cls.__dict__.copy()
    # hack the dict to modify the class copy
    return type(cls)(new_name, cls.__bases__, cls_dict)


MyClass().who_am_i()
MyClass2 = copy_class(MyClass, "MyClass2")
MyClass2().who_am_i()

Output:

<class '__main__.MyClass'>
<class '__main__.MyClass'>
Traceback (most recent call last):
  ...
Exception: closure lies: __class__=<class '__main__.MyClass'> self.__class__=<class '__main__.MyClass2'>

The attrs project uses the following complicated code to workaround this issue (to "update closures"):

        # The following is a fix for
        # <https://github.com/python-attrs/attrs/issues/102>.  On Python 3,
        # if a method mentions `__class__` or uses the no-arg super(), the
        # compiler will bake a reference to the class in the method itself
        # as `method.__closure__`.  Since we replace the class with a
        # clone, we rewrite these references so it keeps working.
        for item in cls.__dict__.values():
            if isinstance(item, (classmethod, staticmethod)):
                # Class- and staticmethods hide their functions inside.
                # These might need to be rewritten as well.
                closure_cells = getattr(item.__func__, "__closure__", None)
            elif isinstance(item, property):
                # Workaround for property `super()` shortcut (PY3-only).
                # There is no universal way for other descriptors.
                closure_cells = getattr(item.fget, "__closure__", None)
            else:
                closure_cells = getattr(item, "__closure__", None)

            if not closure_cells:  # Catch None or the empty list.
                continue
            for cell in closure_cells:
                try:
                    match = cell.cell_contents is self._cls
                except ValueError:  # ValueError: Cell is empty
                    pass
                else:
                    if match:
                        set_closure_cell(cell, cls)

source: attr/_make.py

The implementation of the set_closure_cell() function is really complicate since cells were mutable before Python 3.10: attr/_compat.py.

I propose to add a new functools.copy_class() function which copy a class and update the closures: end of the _create_slots_class() function:

cls = type(self._cls)(...)
for item in cls.__dict__.values():
    ... # update closures
return cls

The alternative is not to add a function to copy a class, just only to "update closures", but IMO such API would be more error prone.

I would like to implement this function, but first I would like to discuss if it makes sense to add such function and check if it's the right abstraction.

@vstinner vstinner added 3.11 only security fixes stdlib Python modules in the Lib dir labels Mar 28, 2022
@vstinner
Copy link
Member Author

It seems like the copy module doesn't support copying a class. copy.deepcopy(cls) doesn't copy a class but returns its argument, the class unchanged.

@vstinner
Copy link
Member Author

vstinner commented Mar 28, 2022

The pickle module doesn't copy a type but gets it from its module. The Python implementation is pickle._Pickler.save_type() which calls pickle._Pickler.save_global().

The cloudpickle module doesn't copy types neither: same behavior than pickle.

Example:

import pickle
import pickletools

class A:
    pass

data = pickle.dumps(A)
pickletools.dis(data)

Output:

    0: \\x80 PROTO      4
    2: \\x95 FRAME      18
   11: \\x8c SHORT_BINUNICODE '\_\_main__'
   21: \\x94 MEMOIZE    (as 0)
   22: \\x8c SHORT_BINUNICODE 'A'
   25: \\x94 MEMOIZE    (as 1)
   26: \\x93 STACK_GLOBAL
   27: \\x94 MEMOIZE    (as 2)
   28: .    STOP
highest protocol among opcodes = 4

In short, it's implemented as:

    getattr(__import__('__main__'), 'A')

@vstinner
Copy link
Member Author

pickle.dump(x) checks if x is a type since commit f048a8f (March 2002) of bpo-494904:

Pickler.save(): Fix for SF bug bpo-494904: Cannot pickle a class with a
metaclass, reported by Dan Parisien.

+ if issubclass(t, TypeType):
+ self.save_global(object)
+ return

Followed by a minor fix: commit 85ee491 of bpo-502085:

Don't die when issubclass(t, TypeType) fails.
-            if issubclass(t, TypeType):
+            try:
+                issc = issubclass(t, TypeType)
+            except TypeError: # t is not a class
+                issc = 0

copy.deepcopy(x) returns x if it's a type since commit 11ade1d (June 2002) of bpo-560794.

SF patch 560794 (Greg Chapman): deepcopy can't handle custom
metaclasses.
         try:
-            copier = x.__deepcopy__
-        except AttributeError:
+            issc = issubclass(type(x), type)
+        except TypeError:
+            issc = 0
+        if issc:
+            y = _deepcopy_dispatch[type](x, memo)
+        else:
             (...)

@vstinner
Copy link
Member Author

vstinner commented Mar 28, 2022

More recent copy.copy() change: commit 5c1c3b4 of bpo-11480:

    Issue bpo-11480: Fixed copy.copy to work with classes with custom metaclasses.

+    try:
+        issc = issubclass(cls, type)
+    except TypeError: # cls is not a class
+        issc = False
+    if issc:
+        # treat it as a regular class:
+        return _copy_immutable(x)

@vstinner
Copy link
Member Author

If I understand correctly, a cell content can be modified since Python 3.7: since commit 64505a1 of bpo-30486:

bpo-30486: Allow setting cell value (bpo-1840)

Antoine Pitrou created bpo-30486 for cloudpickle:

"There are use cases for setting a cell value. One such use case is for (un)pickling recursive closures (see heroic workaround here: https://github.com/cloudpipe/cloudpickle/pull/90/files#diff-d2a3618afedd4e124c532151eedbae09R74 ). Other use cases may include tinkering around and general education value."

@vstinner
Copy link
Member Author

In the Python C API, PEP-384 added PyType_FromSpec(). There is also PyStructSequence_NewType(). PEP-3121 proposed PyType_Copy() but it was never implemented: see bpo-3760. But in C, closures are implemented using a module state, or previously using a global or static variable: cell objects are not used for types implemented in C.

@vstinner
Copy link
Member Author

vstinner commented Mar 28, 2022

The same problem exists at the function level: see bpo-39805: "Copying functions doesn't actually copy them".

For example, copy.deepcopy(func) returns func unchanged if it's a function.

Example:

import copy

def make_closure():
    closure = []
    def append(value):
        closure.append(value)
    return append, closure

func, closure = make_closure()
func(1)
func2 = copy.deepcopy(func)
func2(2)
print(func2 is func)
print(closure)

Output:

True
[1, 2]

@vstinner
Copy link
Member Author

Similar bug without attrs nor dataclasses: bpo-29944 "Argumentless super() fails in classes constructed with type()".

@vstinner
Copy link
Member Author

See also the types.new_class() function:
https://docs.python.org/dev/library/types.html#types.new_class

Oh, I didn't know this function!

@vstinner
Copy link
Member Author

If I understand correctly, a cell content can be modified since Python 3.7: since commit 64505a1 of bpo-30486

Moreover, it's possible to create a cell object since Python 3.8, commit df8d2cd of bpo-35911.

@vstinner
Copy link
Member Author

bpo-32176 "Zero argument super is broken in 3.6 for methods with a hacked __class__ cell" added test_code.test_closure_injection() and fixed the CO_NOFREE flag in the code object constructor (types.CodeType).

@vstinner
Copy link
Member Author

The same problem exists at the function level: see bpo-39805: "Copying functions doesn't actually copy them".

See also bpo-14369 "make function __closure__ writable".

@vstinner
Copy link
Member Author

Note: Implementing a metaclass in Python is hard, it's easy to mess up with closures: see bpo-29270 "ctypes: fail to create a _ctypes._SimpleCData subclass using a closure like calling super() without arguments". type.__new__() is called twice on the same type dict, and the second call overrides the __classcell__ cell value.

@JelleZijlstra
Copy link
Member

I believe the attrs code wouldn't work if a method is decorated with a decorator that wraps the original function, such as @functools.cache.

@vstinner
Copy link
Member Author

The stdlib types module looks like a better place for such new function, rather than the functools module.

The types module documentation starts with: "This module defines utility functions to assist in dynamic creation of new types."
https://docs.python.org/3/library/types.html

@vstinner vstinner changed the title Add functools.copy_class() which updates closures Add types.copy_class() which updates closures Mar 28, 2022
@vstinner vstinner changed the title Add functools.copy_class() which updates closures Add types.copy_class() which updates closures Mar 28, 2022
@vstinner
Copy link
Member Author

Jelle Zijlstra:

I believe the attrs code wouldn't work if a method is decorated with a decorator that wraps the original function, such as @functools.cache.

What do you mean by "wouldn't work"? Do you mean that the semantics of "copy_class()" should be better defined? Currently, copy.deepcopy(MyClass) doesn't copy the class at all, it returns the class unmodified :-)

@functools.cache is designed for unbound methods.

Example:
---

import attr
import functools

@attr.s(slots=True)
class A:
    @staticmethod
    @functools.cache
    def incr(x):
        return x + 1

    @staticmethod
    @functools.lru_cache
    def incr_lru(x):
        return x + 1

obj = A()
print(obj.incr(1))
print(obj.incr_lru(2))

Output (current Python main branch, attrs 21.4.0):
---
2
3
---

@attr.s(slots=True) copies a class but then drops the original class. It doesn't create two classes which share methods, functools caches, etc.

@JelleZijlstra
Copy link
Member

I mean that the code sample above from attrs doesn't properly update the closure for wrapped methods, such as those created by @functools.cache, or any other arbitrary decorator that creates a wrapper function.

Example (with Python 3.9.4 and attrs 21.4.0):

% cat attrslots.py
import attr
import functools

class Base:
    @classmethod
    def f(cls):
        return 3


@attr.s(slots=True)
class Child(Base):
    x: int

    @classmethod
    @functools.cache
    def f(cls):
        return super().f() + 1


print(Child.f())
% python attrslots.py 
Traceback (most recent call last):
  File "/Users/jelle/py/pyanalyze/samples/attrslots.py", line 21, in <module>
    print(Child.f())
  File "/Users/jelle/py/pyanalyze/samples/attrslots.py", line 18, in f
    return super().f() + 1
TypeError: super(type, obj): obj must be an instance or subtype of type

If we provide a types.copy_class(), it should handle this case correctly.

@frenzymadness
Copy link
Mannequin

frenzymadness mannequin commented Apr 2, 2022

Do you think it's a good idea to start a PR with a copy of the implementation from attrs for Python 3.11? We can then add tests for the new function and also some for dataclasses where this new function is needed and try to find all corner cases.

@vstinner
Copy link
Member Author

vstinner commented Apr 2, 2022

Lumír Balhar:

Do you think it's a good idea to start a PR with a copy of the implementation from attrs for Python 3.11? We can then add tests for the new function and also some for dataclasses where this new function is needed and try to find all corner cases.

I'm worried that attrs license is MIT with an "advertisement clause" (the MIT license must be mentioned), whereas Python has a different license. I'm planning to contact the 3 authors of the code to ask their permission.

Also, I expect more feedback on this "idea" first:

I would like to implement this function, but first I would like to discuss if it makes sense to add such function and check if it's the right abstraction.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@vstinner
Copy link
Member Author

vstinner commented May 3, 2022

The attrs code is not enough to implement a general purpose copy_class() function:

  • Methods are copied unchanged
  • Methods names are not updated for the new class name: copied class methods have the name of the original class :-(
  • Cells are shared between the original and the copied class: calling a method of the original class now refers to the copied class :-(

Changing method names requires to create new code objects. Cell objects must be copied manually.

For all these reasons, I prefer to abandon my idea of adding a general purpose copy_class() method, but fix attrs instead. The attrs project has a function very specific to replace a class with a new class with slots. That's more specific and easier to implement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes stdlib Python modules in the Lib dir
Projects
None yet
Development

No branches or pull requests

2 participants