add smart_deepcopy (originaly from #1679) #1920

Merged on Oct 8, 2020 (4 commits)

Conversation

Contributor

@Bobronium Bobronium commented Sep 13, 2020

Change Summary

add smart_deepcopy():

smart_deepcopy(obj: ~Obj) -> ~Obj
    Return obj as is for immutable built-in types
    Use obj.copy() for empty built-in collections
    Use copy.deepcopy() only on non-empty collections and unknown objects

It's primarily needed for faster copying of default values, since they are often immutable types or empty collections.
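
The three rules in the summary can be sketched roughly as follows. This is an illustration only, not the code merged in this PR: the name sketch_smart_deepcopy and the two type sets (IMMUTABLE_TYPES, COLLECTION_TYPES) are assumptions made for the example, and the sets actually used in pydantic/utils.py may differ.

```python
from collections import OrderedDict, defaultdict, deque
from copy import deepcopy
from types import BuiltinFunctionType, CodeType, FunctionType
from typing import TypeVar

Obj = TypeVar("Obj")

# Hypothetical type sets for illustration; not copied from pydantic.
IMMUTABLE_TYPES = {
    int, float, complex, str, bool, bytes, type, type(None), type(...),
    FunctionType, BuiltinFunctionType, CodeType,
}
COLLECTION_TYPES = {list, set, tuple, frozenset, dict, OrderedDict, defaultdict, deque}


def sketch_smart_deepcopy(obj: Obj) -> Obj:
    """Copy obj, skipping copy.deepcopy() where a cheaper path is safe."""
    obj_type = type(obj)
    if obj_type in IMMUTABLE_TYPES:
        # Immutable values can be shared, so return the same object.
        return obj
    if obj_type in COLLECTION_TYPES and not obj:
        # An empty collection has no nested items, so a shallow copy is enough.
        # tuple has no .copy() method and is immutable, so return it as is.
        return obj if obj_type is tuple else obj.copy()
    # Non-empty collections and unknown objects take the full deepcopy path.
    return deepcopy(obj)
```

With this sketch, an empty list comes back as a new, distinct empty list, a string comes back as the very same object, and a nested list such as [[1]] is still copied recursively.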

Here's a benchmark showing the difference in speed for different kinds of values:

from copy import deepcopy
from timeit import timeit

from pydantic.utils import smart_deepcopy, BUILTIN_COLLECTIONS

cases = {
    "IMMUTABLE":
        (1, 1.0, '1', b'1', int, None, smart_deepcopy, len, smart_deepcopy.__code__, lambda: _, ...),
    "EMPTY_COLLECTIONS":
        map(lambda collection: collection(), BUILTIN_COLLECTIONS),
    "NON_EMPTY_COLLECTIONS":
        map(lambda c: c.fromkeys([1]) if issubclass(c, dict) else c([1]), BUILTIN_COLLECTIONS),
}

for name, values in cases.items():
    deepcopy_results = []
    smart_deepcopy_results = []
    for value in values:
        deepcopy_results.append(timeit(lambda: deepcopy(value), number=100000))
        smart_deepcopy_results.append(timeit(lambda: smart_deepcopy(value), number=100000))

    deepcopy_result = sum(deepcopy_results)
    smart_deepcopy_result = sum(smart_deepcopy_results)
    faster_by = deepcopy_result / smart_deepcopy_result
    print(f"{name}: {deepcopy_result=:.3}s, {smart_deepcopy_result=:.3}s, {faster_by=:.3} times")

Output:

IMMUTABLE: deepcopy_result=0.658s, smart_deepcopy_result=0.235s, faster_by=2.8 times
EMPTY_COLLECTIONS: deepcopy_result=1.73s, smart_deepcopy_result=0.261s, faster_by=6.62 times
NON_EMPTY_COLLECTIONS: deepcopy_result=2.33s, smart_deepcopy_result=2.5s, faster_by=0.933 times

Related issue number

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
    (see changes/README.md for details)

@codecov

codecov bot commented Sep 13, 2020

Codecov Report

Merging #1920 into master will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master     #1920   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           21        21           
  Lines         3909      3916    +7     
  Branches       788       788           
=========================================
+ Hits          3909      3916    +7     
Impacted Files Coverage Δ
pydantic/fields.py 100.00% <100.00%> (ø)
pydantic/utils.py 100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bf9cc4a...d8d89fa.

Member

@PrettyWood PrettyWood left a comment

Nice !

Bobronium and others added 2 commits September 13, 2020 18:47
Obj = TypeVar('Obj')


def smart_deepcopy(obj: Obj) -> Obj:
Contributor

Small food for thought: smart_deepcopy doesn't seem particularly descriptive of what this function does. I can't come up with a better name, so I'm just raising it in case someone else has an idea 😅

Contributor Author

I also thought about «fast_deepcopy», but the function is actually a bit slower when it has to deepcopy the value, so «smart» seemed a better choice.

I agree, it's a kinda vague word, so I would love to hear any better suggestions :)

Member

I agree with both, it's not a perfect name, but I can't think of anything better.

@samuelcolvin samuelcolvin merged commit d5e9d9a into pydantic:master Oct 8, 2020
@samuelcolvin
Member

thanks so much, sorry I've taken so long to review this.

@samuelcolvin
Member

I guess you can now update #1679

4 participants