Only hash `model_fields`, not whole `__dict__` #7786
Conversation
Have you tried other implementations to compare performance, say a dict comprehension? I'd be curious to see how various implementations compare.
This:

```python
def hash_func(self: Any) -> int:
    d = self.__dict__
    return hash(self.__class__) + hash(tuple(d[k] for k in cls.model_fields))
```

is about 2 to 3 times slower than the implementation in this PR. I'm actually surprised it's not worse. We always hear about how slow Python is and how deferring to C makes a huge difference, but I often get underwhelming results when doing so.
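For reference, this kind of micro-comparison can be sketched on plain dicts, without pydantic (the field names and counts here are made up for illustration):

```python
import operator
import timeit

# Stand-in for a model's __dict__ and model_fields (hypothetical names).
fields = [f"x{i}" for i in range(20)]
d = {k: i for i, k in enumerate(fields)}

getter = operator.itemgetter(*fields)

def hash_itemgetter() -> int:
    # itemgetter builds the value tuple in C
    return hash(getter(d))

def hash_comprehension() -> int:
    # the generator expression builds the tuple in Python bytecode
    return hash(tuple(d[k] for k in fields))

t_get = timeit.timeit(hash_itemgetter, number=100_000)
t_com = timeit.timeit(hash_comprehension, number=100_000)
print(f"itemgetter: {t_get:.3f}s, comprehension: {t_com:.3f}s")
```

Both variants hash the same value tuple, so they always agree; only the time to build the tuple differs.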
Interesting comparison!

```python
def hash_func(self: Any) -> int:
    d = {k: self.__dict__[k] for k in cls.model_fields}
    return hash(self.__class__) + hash(tuple(d.values()))
```

Sometimes dict comprehensions are very fast.
I'll leave you to try these yourself. Also, …
I ran your benchmark and plotted the time taken as a function of the number of fields; your implementation is definitely the fastest, and is faster than the current one for very small objects. I tweaked the implementation to call … This makes a good starting point to fix …

Code for the benchmark:

```python
import gc
import operator
import timeit
from typing import Type
import matplotlib.pyplot as plt
import pydantic
import tqdm.auto as tqdm
def set_hash_func_default(cls):
pass
def set_hash_func_operator(cls: Type[pydantic.BaseModel]):
getter = operator.itemgetter(*cls.model_fields.keys()) if cls.model_fields else lambda _: None
def hash_func(self) -> int:
return hash((self.__class__, getter(self.__dict__)))
cls.__hash__ = hash_func
def set_hash_func_tuple(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
return hash((self.__class__, tuple(d[k] for k in cls.model_fields)))
cls.__hash__ = hash_func
def set_hash_func_comprehension(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
fields = {k: d[k] for k in cls.model_fields}
return hash((self.__class__, tuple(fields.values())))
cls.__hash__ = hash_func
setup = """
import pydantic
Model = pydantic.create_model(
"Model",
__config__=dict(frozen=True),
**{f'x{i}': (int, i) for i in range(%(size)d)}
)
set_hash_func(Model)
instance = Model()
"""
statement = "hash(instance)"
sizes = list(range(100))
N_EXEC = 10_000
ys = {}
for set_hash_func in [
set_hash_func_default,
set_hash_func_operator,
set_hash_func_tuple,
set_hash_func_comprehension,
]:
namespace = {"set_hash_func": set_hash_func}
ys[set_hash_func.__name__] = [
# Take the min of the repeats as recommended by timeit doc
min(
timeit.Timer(setup=setup % {"size": size}, stmt=statement, globals=namespace).repeat(
repeat=5, number=N_EXEC
)
)
/ N_EXEC
for size in tqdm.tqdm(sizes, desc=set_hash_func.__name__)
]
gc.collect()
fig, ax = plt.subplots()
for name, times in ys.items():
ax.plot(sizes, times, label=name)
ax.set_title("Performance benchmark for __hash__ implementations")
ax.set_xlabel("Number of fields in the pydantic model")
ax.set_ylabel("Average hash() time (s)")
plt.legend()
plt.tight_layout()
fig.savefig("results.png", dpi=300, facecolor="white", transparent=False)
```
I think the …

That makes sense. I didn't see #7785 before. I would also be in favor of keeping …
```diff
     getter = operator.itemgetter(*cls.model_fields.keys()) if cls.model_fields else lambda _: None

     def hash_func(self: Any) -> int:
-        return hash(self.__class__) + hash(tuple(self.__dict__.values()))
+        return hash(self.__class__) + hash(getter(self.__dict__))
```
Is this faster than just doing:

```python
def hash_func(self: Any) -> int:
    return hash(self.__class__) + hash(tuple(self.__dict__.get(k) for k in cls.model_fields))
```

I'd prefer to avoid adding the `operator` import if possible (as we have generally been trying to reduce the number of imports pydantic triggers, which increases startup time), but if it's noticeably faster then maybe it makes sense. I think if you can't see a significant and repeatable difference, it's better not to use it.
Unless I've misunderstood and the `getter` thing is doing something different from my suggestion.
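One subtlety worth spelling out here (a toy illustration, not from the PR itself): `operator.itemgetter` returns a bare value for a single key and a tuple for several, and rejects zero keys, which is why the proposed implementation special-cases an empty `model_fields`:

```python
import operator

d = {"x": 1, "y": 2}

# One key: itemgetter returns the value itself, not a 1-tuple.
assert operator.itemgetter("x")(d) == 1

# Several keys: itemgetter returns a tuple of values, in the given order.
assert operator.itemgetter("x", "y")(d) == (1, 2)

# Zero keys is a TypeError, hence the `lambda _: None` fallback for
# models that have no fields at all.
try:
    operator.itemgetter()
except TypeError as e:
    print("itemgetter() with no args:", e)
```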
Doh I see there was a whole conversation about this and confirmation that it is significantly faster. 👍
I still think we should confirm with @samuelcolvin though given the point about wanting to reduce import times.
I see you propose using `.get()` instead of `[]`. Do you know whether it is safe to use `itemgetter()`, which doesn't have a default and thus will raise an error if the key is not found? Said differently, are there cases where some keys from `cls.model_fields` are not present on the instance?
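To make the difference concrete (a toy dict standing in for a model's `__dict__`, not actual pydantic behavior):

```python
import operator

d = {"x": 1}  # imagine field "y" is somehow missing from __dict__

# dict.get silently substitutes None for the missing key...
assert tuple(d.get(k) for k in ("x", "y")) == (1, None)

# ...while itemgetter propagates the failed lookup as a KeyError.
getter = operator.itemgetter("x", "y")
try:
    getter(d)
except KeyError as e:
    print("missing key:", e)
```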
> Said differently, are there cases where some keys from `cls.model_fields` are not present on the instance?
I tried to find ways that this might happen, which led me to #7784. I couldn't think of anything else but I don't actually know pydantic that deeply so I could easily be missing something. But my guess is that if it's possible then it's enough of a red flag that raising an error here would actually be a good thing.
Regarding pydantic load time, it seems pydantic already imports … It could be that …
I performed a timing benchmark; I can't measure any difference between importing `operator` and pydantic vs. importing only pydantic.

Time pydantic alone:

```shell
for i in {1..100}; do { time -p python -c "import pydantic"; } 2>> import_time.txt; done
python -c 'import pathlib; print(min(float(line.split(" ")[1]) for line in pathlib.Path("import_time.txt").read_text().split("\n") if line.startswith("real")))'
```

Time pydantic + operator:

```shell
for i in {1..100}; do { time -p python -c "import pydantic, operator"; } 2>> import_time_operator.txt; done
python -c 'import pathlib; print(min(float(line.split(" ")[1]) for line in pathlib.Path("import_time_operator.txt").read_text().split("\n") if line.startswith("real")))'
```
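Another way to attribute startup cost, rather than timing whole processes, is CPython's `-X importtime` flag, which prints a per-module import-time breakdown to stderr (shown here on a stdlib module; substitute `import pydantic` to profile pydantic):

```shell
# Each stderr line looks like: "import time: self [us] | cumulative | imported package".
# Modules with the largest cumulative column dominate startup.
python3 -X importtime -c "import json" 2>&1 | head -n 5
```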
I did a further benchmark in case we need to handle some keys in … Interestingly, performance of the stdlib varies with python version: on python 3.8, only wrapping … I guess we could implement a switch based on the size of the …

Python 3.8 runs, Python 3.12 runs: plots omitted.

Benchmark code:

```python
import collections
import dataclasses
import gc
import operator
import sys
import timeit
from typing import Any, Callable, Dict, Set, Tuple, Type
import matplotlib.pyplot as plt
import numpy as np
import pydantic
import tqdm.auto as tqdm
PYTHON_VERSION = ".".join(map(str, sys.version_info))
def safe_itemgetter(*items) -> Callable[[Dict[str, Any]], Tuple[Any, ...]]:
if not items:
return lambda _: tuple()
key_set = frozenset(items)
getter = operator.itemgetter(*items)
def safe_wrapper(d: Dict[str, Any]) -> Tuple[Any, ...]:
return (
getter(d) if d.keys() >= key_set else getter(collections.defaultdict(lambda: None, d))
)
return safe_wrapper
def noop(*args, **kwargs):
pass
def set_hash_func_itemgetter(cls: Type[pydantic.BaseModel]):
getter = operator.itemgetter(*cls.model_fields.keys()) if cls.model_fields else lambda _: None
def hash_func(self) -> int:
return hash(getter(self.__dict__))
cls.__hash__ = hash_func
def set_hash_func_itemgetter_defaultdict(cls: Type[pydantic.BaseModel]):
getter = operator.itemgetter(*cls.model_fields.keys()) if cls.model_fields else lambda _: None
def hash_func(self) -> int:
return hash(getter(collections.defaultdict(lambda: None, self.__dict__)))
cls.__hash__ = hash_func
def set_hash_func_itemgetter_safe(cls: Type[pydantic.BaseModel]):
getter = safe_itemgetter(*cls.model_fields.keys())
def hash_func(self) -> int:
return hash(getter(self.__dict__))
cls.__hash__ = hash_func
def set_hash_func_tuple_comprehension(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
return hash(tuple(d[k] for k in cls.model_fields))
cls.__hash__ = hash_func
def set_hash_func_get_comprehension(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
return hash(tuple(d.get(k) for k in cls.model_fields))
cls.__hash__ = hash_func
def set_hash_func_dict_comprehension(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
fields = {k: d[k] for k in cls.model_fields}
return hash(tuple(fields.values()))
cls.__hash__ = hash_func
SETUP = """
import pydantic
Model = pydantic.create_model(
"Model",
__config__=dict(frozen=True),
**{f'x{i}': (int, i) for i in range(%(size)d)}
)
set_hash_func(Model)
instance = Model()
# Mimic @functools.cached_property editing __dict__
# instance.__dict__["not_a_field"] = None
"""
STATEMENT = "hash(instance)"
def time_hash_func(
set_hash_func,
*,
sizes,
n_execution: int,
n_repeat: int = 5,
progress_bar: bool = True,
) -> np.ndarray:
namespace = {"set_hash_func": set_hash_func}
times = [
# Take the min of the repeats as recommended by timeit doc
min(
timeit.Timer(
setup=SETUP % {"size": size},
stmt=STATEMENT,
globals=namespace,
).repeat(repeat=n_repeat, number=n_execution)
)
/ n_execution
for size in tqdm.tqdm(sizes, desc=set_hash_func.__name__, disable=not progress_bar)
]
gc.collect()
return np.array(times, dtype=float)
N_EXEC = 10_000
fig, ax = plt.subplots()
sizes = list(range(100))
baseline = time_hash_func(noop, sizes=sizes, n_execution=N_EXEC)
# Plot the baseline
ax.plot(sizes, baseline / baseline, label="baseline")
for label, set_hash_func in {
"itemgetter": set_hash_func_itemgetter,
"itemgetter+defaultdict": set_hash_func_itemgetter_defaultdict,
"itemgetter+safety": set_hash_func_itemgetter_safe,
# Those have bad performances and are deactivated to speed-up experimentation
# "tuple-comprehension": set_hash_func_tuple_comprehension,
# "get-comprehension": set_hash_func_get_comprehension,
# "dict-comprehension": set_hash_func_dict_comprehension,
}.items():
times = time_hash_func(set_hash_func, sizes=sizes, n_execution=N_EXEC)
ax.plot(sizes, times / baseline, label=label)
ax.set_title(f"__hash__ timing benchmark (python {PYTHON_VERSION})")
ax.set_xlabel("Number of fields in the pydantic model")
ax.set_ylabel("Average time relative to baseline")
plt.legend()
plt.tight_layout()
fig.savefig(
f"hash-benchmark_nocache_python-{PYTHON_VERSION}.png",
dpi=300,
facecolor="white",
transparent=False,
) |
Thanks @QuentinSoubeyranAqemia for all your analysis. Regarding importing operator, in addition to the points above, there's already a …

I tried the following code, inspired by …:

```python
class Model(pydantic.BaseModel, frozen=True):
    attr: int = pydantic.Field(init_var=False)

m = Model()  # ValidationError
```
@alexmojaki I think we still need to remove …

I agree, but that issue is still in a pending state. @dmontagu WDYT, can I remove …

Maybe we could add a test in …

@alexmojaki Just FYI, #7825 revealed an edge case where you can have an instance that is missing some fields. This impacts your use of …
Thanks @QuentinSoubeyranAqemia. That means this PR should also fix another bug, since the current implementation assumes that the order of …:

```python
from pydantic import BaseModel

class A(BaseModel, frozen=True):
    x: int
    y: int

a = A(x=1, y=2)
a2 = a.copy(exclude={"x"}).copy(update={"x": 1})
print(a.__dict__)   # {'x': 1, 'y': 2}
print(a2.__dict__)  # {'y': 2, 'x': 1}
assert a == a2
assert hash(a) != hash(a2)
```

I've pushed a fix with the fastest overall approach I could find, and I expect it can be reused for … While I'm at it I removed the …
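The order dependence is easy to reproduce with plain dicts (a toy example, independent of pydantic): hashing raw `__dict__.values()` is sensitive to insertion order, while extracting values in a fixed `model_fields` order is not.

```python
import operator

# Same key/value pairs, different insertion order.
d1 = {"x": 1, "y": 2}
d2 = {"y": 2, "x": 1}

# Hashing raw .values() depends on insertion order:
assert tuple(d1.values()) != tuple(d2.values())

# Extracting in a fixed field order does not:
getter = operator.itemgetter("x", "y")
assert getter(d1) == getter(d2) == (1, 2)
assert hash(getter(d1)) == hash(getter(d2))
```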
I think this is a mistake, and that it's OK for the hashes to match if only the classes differ.
…k in test_hash_function_give_different_result_for_different_object for differing classes
I'll leverage the … Getting models with missing fields is very weird; it violates the Liskov substitution principle. Interestingly, the documented alternative to the deprecated … For completeness' sake, I ran the benchmarks again with your safety mechanism: it has negligible performance impact on python 3.8, and since python 3.11 got zero-cost exception handling, the impact is non-existent on python 3.11+.

Benchmark code:

```python
import collections
import gc
import itertools
import operator
import sys
import textwrap
import timeit
from os import name
from typing import Any, Dict, Iterable, List, Optional, Sized, Type
import matplotlib.pyplot as plt
import numpy as np
import pydantic
import tqdm.auto as tqdm
from matplotlib import axes, figure
PYTHON_VERSION = ".".join(map(str, sys.version_info))
def noop(*args, **kwargs):
pass
def set_hash_func_itemgetter(cls: Type[pydantic.BaseModel]):
getter = operator.itemgetter(*cls.model_fields.keys()) if cls.model_fields else lambda _: None
def hash_func(self) -> int:
return hash(getter(self.__dict__))
cls.__hash__ = hash_func
def set_hash_func_itemgetter_defaultdict(cls: Type[pydantic.BaseModel]):
getter = operator.itemgetter(*cls.model_fields.keys()) if cls.model_fields else lambda _: None
def hash_func(self) -> int:
return hash(getter(collections.defaultdict(lambda: None, self.__dict__)))
cls.__hash__ = hash_func
class FallbackDict:
def __init__(self, inner):
self.inner = inner
def __getitem__(self, key):
return self.inner.get(key)
def set_hash_func_itemgetter_safe(cls: Type[pydantic.BaseModel]):
getter = operator.itemgetter(*cls.model_fields.keys()) if cls.model_fields else lambda _: 0
def hash_func(self: Any) -> int:
try:
return hash(getter(self.__dict__))
except KeyError:
# In rare cases (such as when using the deprecated copy method), the __dict__ may not contain
# all model fields, which is how we can get here.
# getter(self.__dict__) is much faster than any 'safe' method that accounts for missing keys,
# and wrapping it in a `try` doesn't slow things down much in the common case.
return hash(getter(FallbackDict(self.__dict__))) # type: ignore
cls.__hash__ = hash_func
def set_hash_func_tuple_comprehension(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
return hash(tuple(d[k] for k in cls.model_fields))
cls.__hash__ = hash_func
def set_hash_func_get_comprehension(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
return hash(tuple(d.get(k) for k in cls.model_fields))
cls.__hash__ = hash_func
def set_hash_func_dict_comprehension(cls: Type[pydantic.BaseModel]):
def hash_func(self) -> int:
d = self.__dict__
fields = {k: d[k] for k in cls.model_fields}
return hash(tuple(fields.values()))
cls.__hash__ = hash_func
class SizedIterable(Sized, Iterable):
pass
def run_benchmark(
title: str,
setup: str,
statement: str,
n_execution: int = 10_000,
n_repeat: int = 5,
globals: Optional[Dict[str, Any]] = None,
progress_bar: bool = True,
params: Optional[Dict[str, SizedIterable]] = None,
) -> np.ndarray:
namespace = globals or {}
# fast-path
if not params:
length = 1
packed_params = [()]
else:
length = len(next(iter(params.values())))
# This iterator yields a tuple of (key, value) pairs
# First, make a list of N iterator over (key, value), where the provided values are iterated
param_pairs = [zip(itertools.repeat(name), value) for name, value in params.items()]
# Then pack our individual parameter iterator into one
packed_params = zip(*param_pairs)
times = [
# Take the min of the repeats as recommended by timeit doc
min(
timeit.Timer(
setup=setup % dict(local_params),
stmt=statement,
globals=namespace,
).repeat(repeat=n_repeat, number=n_execution)
)
/ n_execution
for local_params in tqdm.tqdm(
packed_params, desc=title, total=length, disable=not progress_bar
)
]
gc.collect()
return np.asarray(times, dtype=float)
def run_hash_benchmark(
title: str,
set_hash_func,
sizes: SizedIterable,
mimic_cached_property: bool,
n_execution: int = 10_000,
n_repeat: int = 5,
) -> np.ndarray:
setup = textwrap.dedent(
f"""
import pydantic
Model = pydantic.create_model(
"Model",
__config__=dict(frozen=True),
**{{f'x{{i}}': (int, i) for i in range(%(size)d)}}
)
set_hash_func(Model)
instance = Model()
# Mimic @functools.cached_property editing __dict__
{"instance.__dict__['not_a_field'] = None" if mimic_cached_property else ''}
"""
)
statement = "hash(instance)"
namespace = {"set_hash_func": set_hash_func}
return run_benchmark(
title=title,
setup=setup,
statement=statement,
n_execution=n_execution,
n_repeat=n_repeat,
globals=namespace,
params={"size": sizes},
)
def plot_benchmark(
title: str,
sizes: List[int],
mimic_cached_property: bool,
ax: Optional[axes.Axes] = None,
):
ax = ax or plt.gca()
baseline = run_hash_benchmark(
title=f'{title}, baseline',
set_hash_func=noop,
sizes=sizes,
mimic_cached_property=mimic_cached_property,
)
ax.plot(sizes, baseline / baseline, label='baseline')
for label, set_hash_func in {
"itemgetter": set_hash_func_itemgetter,
"itemgetter+defaultdict": set_hash_func_itemgetter_defaultdict,
"itemgetter+safety": set_hash_func_itemgetter_safe,
# Those have bad performances and are deactivated to speed-up experimentation
# "tuple-comprehension": set_hash_func_tuple_comprehension,
# "get-comprehension": set_hash_func_get_comprehension,
# "dict-comprehension": set_hash_func_dict_comprehension,
}.items():
times = run_hash_benchmark(
title=f'{title}, {label}',
set_hash_func=set_hash_func,
sizes=sizes,
mimic_cached_property=mimic_cached_property,
)
ax.plot(sizes, times / baseline, label=label)
ax.set_title(title)
ax.set_xlabel('Number of pydantic fields')
ax.set_ylabel('Average time relative to baseline')
return ax
def plot_all_benchmarks(sizes: List[int]) -> figure.Figure:
n_rows, n_cols = 1, 2
fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols * 6, n_rows * 4))
for col, mimic_cached_property in enumerate([False, True]):
plot_benchmark(
title=f'{mimic_cached_property=}',
sizes=sizes,
mimic_cached_property=mimic_cached_property,
ax=axes[col],
)
for ax in axes.ravel():
ax.legend()
fig.suptitle(f"__hash__ timing benchmark (python {PYTHON_VERSION})")
return fig
if __name__ == "__main__":
fig = plot_all_benchmarks(list(range(100)))
plt.tight_layout()
target = f"hash-benchmark_python-{PYTHON_VERSION}.png"
fig.savefig(
target,
dpi=300,
facecolor="white",
transparent=False,
)
print(f"Wrote {target}", file=sys.stderr)
```
@alexmojaki regarding the removal of the class in the hash — I think it might make more sense for it to grab the generic origin and include that in the hash (same as is done by the equality method). That would at least keep different classes with the same data dicts having different hashes, presumably reducing collisions. But obviously that's already going to be rare so maybe it is better to not worry about it and let the equality check differentiate. (And I guess it will make the common case of having different data a bit faster anyway due to eliminating the hashing of the type?) I'm happy to defer to you.

(@alexmojaki I'll wait to actually merge until you give a reply to the last comment about how you want to proceed, but to be clear I'm okay merging this as is if you think that's best.)

I think it'd be so weird to have a dict/set containing a large number of models of different types (already an oddity in normal code) with the same data that it's worth optimising the common case.

LGTM overall, but I'd prefer to wait and merge once we release … Thanks for your work on this @alexmojaki!
Change Summary

This deals with just the `__hash__` part of #7444, because initially I was worried that it would be hard to fix without a significant hit to performance. In my timing tests, the previous approach is slightly slower for 5 fields, but about 60-70% faster for 100 fields. So this PR may slow down some programs a little, but in the larger scheme of things it's hard to imagine it making a real difference, since it can still hash a million times in about 2 seconds.
Closes #7785
Checklist