New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate a hash function when frozen
is True
#1881
Generate a hash function when frozen
is True
#1881
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1881 +/- ##
=======================================
Coverage 99.90% 99.90%
=======================================
Files 25 25
Lines 5030 5036 +6
Branches 1030 1030
=======================================
+ Hits 5025 5031 +6
Misses 1 1
Partials 4 4
Continue to review full report at Codecov.
|
I think this is a good idea but I don't think this will work if the fields in the model cannot be hashed. For example what happens if you do the following: from pydantic import BaseModel
class A(BaseModel):
x: int
y: List[int]
class Config:
allow_mutation = False
a = A(x=1, y=[1,2,3])
d = {a: 2}
d[a]
>>> ? Might be worth putting in a validation step to ensure that the fields are hashable? Or alternatively I think this would be a better approach as it should allow you to do nested immutability: from pydantic import BaseModel
class B(BaseModel):
z: int
class Config:
allow_mutation = False
class A(BaseModel):
x: int
y: B
class Config:
allow_mutation = False
a = A(x=1, y=B(z=4))
d = {a: 2} |
I've had to roll my own
The |
Hi @ccharlesgb and @layday , thanks for your answer. I agree with your comments and I corrected my implementation of What do you think ? If I am correct, this has now the same behavior has in
|
pydantic/main.py
Outdated
@@ -313,6 +313,7 @@ def __new__(mcs, name, bases, namespace, **kwargs): # noqa C901 | |||
'__schema_cache__': {}, | |||
'__json_encoder__': staticmethod(json_encoder), | |||
'__custom_root_type__': _custom_root_type, | |||
'__hash__': (lambda self: hash(tuple(self.__dict__.values()))) if not config.allow_mutation else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of .values
we should probably have .items
here. The keys ought to be considered a part of the identity of the object.
Example:
x = {"a": 1, "b": True}
y = {"d": 1, "c": True}
hash(tuple(x.values())) == hash(tuple(y.values())) # True
hash(tuple(x.items())) == hash(tuple(y.items())) # False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, we diverge from behaviour of the built-in dataclass. But I do not know if it is a problem.
If it not a problem, then I agree with you ! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to be a lambda, it can be proper method on BaseModel
I imagine. It could raise an error if allow_mutation is True
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to be a lambda, it can be proper method on
BaseModel
I imagine. It could raise an error ifallow_mutation is True
.
The 'correct' way to make an object unhashable is by setting its __hash__
attribute to None
:
In [1]: class Foo:
...: pass
...:
In [2]: class Bar:
...: __hash__ = None
...:
In [3]: hash(Foo())
Out[3]: 284941123
In [4]: hash(Bar())
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-d413494abdfb> in <module>
----> 1 hash(Bar())
TypeError: unhashable type: 'Bar'
This would have to happen on the metaclass and is done automatically by the Python interpreter for classes which implement __eq__
but not __hash__
:
In [5]: from pydantic import BaseModel
In [6]: class Baz(BaseModel):
...: pass
...:
In [7]: hash(Baz())
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-67f7cb492b68> in <module>
----> 1 hash(Baz())
TypeError: unhashable type: 'Baz'
In [8]: Baz.__hash__
In [9]: class Bee:
...: def __eq__(self, other):
...: ...
...:
In [10]: Bee.__hash__
In [11]: Foo.__hash__
Out[11]: <slot wrapper '__hash__' of 'object' objects>
If __hash__
were implemented on BaseModel
, we'd have to emulate the default behaviour. In addition, if __hash__
is not None
, isinstance(obj, typing.Hashable)
will not have the desired effect:
In [12]: from typing import Hashable
In [13]: isinstance(Bee(), Hashable)
Out[13]: False
In [14]: class FauxUnhashable:
...: def __hash__(self):
...: raise TypeError
...:
In [51]: isinstance(FauxUnhashable(), Hashable)
Out[51]: True
Reference: https://docs.python.org/3/reference/datamodel.html#object.__hash__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, so it can't just be a method on BaseModel
, but let's make it a proper function, not a lambda.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of
.values
we should probably have.items
here. The keys ought to be considered a part of the identity of the object.
I think this would be unnecessary if we include a reference to the model class.
m = TestModel() | ||
with pytest.raises(TypeError) as exc_info: | ||
hash(m) | ||
assert "unhashable type: 'list'" in exc_info.value.args[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO would be nice for the exception message to be more like "unhashable type: 'TestModel' contains unhashable type: 'list' but I wouldn't say it's a strong opinion. Food for thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good start, but my problem here is that I imagine in the following code f
and b
would have the same hash and therefore be "equal"
class Foo(BaseModel):
a: str
b: int
class Config:
allow_mutation = False
class Bar(BaseModel):
a: str
b: int
def dict(self, *args, **kwargs):
d = super().dict(*args, **kwargs)
d.pop('a')
return d
class Config:
allow_mutation = False
f = Foo(a='xx', b=2)
b = Bar(a='xx', b=3)
Which is definitely not what people would expected.
I think we need to add some kind of reference to the qualname to the hash.
I think we also need to hide this behind another config option.
pydantic/main.py
Outdated
@@ -313,6 +313,7 @@ def __new__(mcs, name, bases, namespace, **kwargs): # noqa C901 | |||
'__schema_cache__': {}, | |||
'__json_encoder__': staticmethod(json_encoder), | |||
'__custom_root_type__': _custom_root_type, | |||
'__hash__': (lambda self: hash(tuple(self.__dict__.values()))) if not config.allow_mutation else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to be a lambda, it can be proper method on BaseModel
I imagine. It could raise an error if allow_mutation is True
.
Would hashing the model class not work, e.g. |
I guess it might work, attrs might do it that way to support older versions of python. |
as long as this feature is hidden behind a 'hashable` config parameter, I'm not too bothered. We can say it's new and in "beta" or something in the docs. |
hmm, would |
So 2 things to modify:
|
|
Sounds good to me.
also sounds good to me. |
@rhuille are you going to finish this? otherwise let's lose it. |
Hi @samuelcolvin yes I am doing this now. Thanks for the specifications. |
For now, `frozen` is a strict duplication of `allow_mutation` parameter i.e. setting `frozen=True` does everything that `allow_mutation=False` does. NB: this does not change the behavior of `allow_mutation`. In next commit, setting `frozen=True` will also make the BaseModel hashable while the existing behavior of `allow_mutation` will not be updated.
b1a84dc
to
f7368ec
Compare
allow_mutation
is False
frozen
is True
f7368ec
to
c9098eb
Compare
45da2e7
to
42cd7ba
Compare
Now, setting `frozen=True` also generate a hash function for the model i.e. `__hash__` is not `None`. This makes instances of the model potentially hashable if all the attributes are hashable. (default: `False`)
42cd7ba
to
b4befc5
Compare
Hi @samuelcolvin I have reworked the PR (and updated the description) to match the specifications specified by @layday and you. Thanks a lot for your feedback ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks quite good.
Do we want to tackle the case frozen=True
and allow_mutation=True
and raise an error in this case ?
Good question ! For now, if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise I think this looks great.
tests/mypy/modules/plugin_fail.py
Outdated
frozen = True | ||
extra = Extra.forbid | ||
|
||
def config_method(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need this?
tests/mypy/modules/plugin_fail.py
Outdated
x: int | ||
y: str | ||
|
||
def method(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be removed too?
tests/mypy/modules/plugin_fail.py
Outdated
... | ||
|
||
|
||
frozenmodel = FrozenModel(x=1, y='y', z='z') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't testing frozen behaviour I think, again can be removed.
tests/mypy/modules/plugin_fail.py
Outdated
FrozenModel.from_orm({}) | ||
FrozenModel.from_orm({}) # type: ignore[pydantic-orm] # noqa F821 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, are these lines testing frozen behaviour?
tests/mypy/modules/plugin_success.py
Outdated
|
||
|
||
class NotFrozenModel(FrozenModel): | ||
a = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a type here to avoid extra errors in plugin-success-strict.txt
tests/test_main.py
Outdated
|
||
immutable = allow_mutation_ is False or frozen_ is True | ||
|
||
if immutable: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than this if clause, better to setup two separate tests.
Before: there were errors about other stuff than frozen behavior After: The modification catch only errot related to the frozen behavior
One function tests the behavior: 'the model is mutable' The other tests the behavior:OC 'the model is immutable'
Hi @samuelcolvin , thanks for your review ! Since your review:
Also, if you want me to, I can clean the commits history. Thanks a lot, I am very proud of this work :D |
This is great, thank you so much. |
Hi, I've been playing with master (at 90df33c) and am concerned that this eliminates the ability for a user-defined I'm all in favour of auto-generating it upon request, but I wonder if it would be less intrusive to user code to avoid setting propose changing
to
(or some other mechanism e.g. popping again from the new namespace) I might be missing something though - is there a suggested approach that would let me define the hash for my model? Without it, I can't put my model instances in a set, dictionary etc. Motivating example: having a model which the |
I don't understand @davidhyman |
@PrettyWood I think you have fixed it in #2423 Thank you! (and thanks to all the other contributors) Sorry I was late to the party. I spent a bit of time on a repro today, originally thought I couldn't reproduce it and then realised that it's fixed (at least, in 1.8.1). I don't think I'd realised it was an inheritance thing; recalled commenting out the hash assignment, so was sure it was something that had originally changed in this MR 🤔 . import pydantic
class P(pydantic.BaseModel):
a: int
def __hash__(self):
"""custom hash function"""
return self.a
class M(P):
"""our hash function is still overwritten."""
b: dict = dict()
class Config:
"""pydantic config."""
allow_mutation = False
frozen = True
m = M(a=5)
assert m.__hash__() == 5 |
Change Summary
Before:
After:
Note that we still have:
Screenshot of the updated documentation:
Related issue number
closes #1880
linked issue: #1303
Checklist
changes/<pull request or issue id>-<github username>.md
file added describing change(see changes/README.md for details)