Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #440: Incorrect pickles for subclasses of generic classes #448

Merged
merged 3 commits into from Feb 16, 2022

Conversation

huzecong
Copy link
Contributor

This PR fixes #440: Incorrect pickles for subclasses of generic classes.

As pointed out in the issue comment, the root of the issue lies in _get_bases, which returned incorrect values for subclasses of generic classes. For example, the GLeafAny class in the issue was determined to have original bases of (__main__.Generic[~T],), although its actual base is GDerivedAny, and since that is not an _GenericAlias, GLeafAny shouldn't have __orig_bases__ at all.

However, in _get_bases, we were using hasattr(typ, '__orig_bases__') to check if the class has __orig_bases__, which would go through to its bases classes, in this case, to GDerivedAny.__orig_bases__. This is wrong; we should only use __orig_bases__ if it's defined on the current class. Thus my change.

@huzecong huzecong marked this pull request as ready for review September 18, 2021 02:38
@codecov
Copy link

codecov bot commented Sep 20, 2021

Codecov Report

Merging #448 (3e4b74b) into master (5d89947) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #448   +/-   ##
=======================================
  Coverage   92.56%   92.56%           
=======================================
  Files           4        4           
  Lines         713      713           
  Branches      156      156           
=======================================
  Hits          660      660           
  Misses         32       32           
  Partials       21       21           
Impacted Files Coverage Δ
cloudpickle/cloudpickle.py 88.53% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d89947...3e4b74b. Read the comment docs.

@huzecong
Copy link
Contributor Author

Hi, can I get a review on this? I know you're probably all busy and maintain this in your free time, but this PR is small and shouldn't take much of your time, and I'd appreciate if we can get this into the next release. Thanks!

Copy link
Contributor

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix and the test. LGTM. Can you please document it in the changelog?

Copy link
Member

@pierreglaser pierreglaser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, thanks @huzecong, and sorry for the delay.
Also, is the issue about __module__ that you mentioned in #440 (comment) fixed in/by this PR?

class Base(typing.Generic[T]):
pass

class DerivedAny(Base):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is deriving from Base (which is a user-defined generic type) without specifying the type T a realistic typing use case? Otherwise, I think we should just keep the DerivedInt/LeafInt classes in the test, which, from a quick glance seem to be regular typing/subclassing pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that this is a realistic use case. Consider this scenario: a library author wrote a generic base class, but users of this library may not be that proficient with typing (or they chose to not type annotate stuff at all), so they inherit from the generic base class without specifying the type. I've seen this quite a lot actually.

PEP 484 also states:

Subclassing a generic class without specifying type parameters assumes Any for each position.

So I think it's important that we support this use case in cloudpickle as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, that's convincing.

@huzecong
Copy link
Contributor Author

huzecong commented Feb 4, 2022

is the issue about __module__ that you mentioned in #440 (comment) fixed in/by this PR?

Unfortunately no, and I also believe it is outside of the scope of this PR.

The reason we're seeing types.ClassName is because the class is reconstructed using types.new_class:

import types

cls = types.new_class("TestClass", (), {})
print(cls)  # <class 'types.TestClass'>
cls.__module__ = "__main__"
print(cls)  # <class '__main__.TestClass'>

We could fix it by also recording the class's __module__ in _dynamic_class_reduce and assigning it back during reconstruction, but I think that is less important and deserves a separate PR.

@pierreglaser pierreglaser merged commit 1bf5277 into cloudpipe:master Feb 16, 2022
@pierreglaser
Copy link
Member

Thanks a lot @huzecong, merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect pickles for subclasses of generic classes
3 participants