Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add private attributes support #1679

Merged
merged 26 commits into from Oct 26, 2020

Conversation

Bobronium
Copy link
Contributor

@Bobronium Bobronium commented Jul 2, 2020

Change Summary

Private attributes declared as regular fields, but always start with underscore and PrivateAttr is used instead of Field. Upon class creation they added in __slots__ and Model.__private_attributes__ for faster lookups and support for default values
Default values declared just like fields: _attr[: annotation] = default_value or _attr = PrivateAttr(default_value)

Example from docs

from datetime import datetime
from random import randint

from pydantic import BaseModel, PrivateAttr


class TimeAwareModel(BaseModel):
    _processed_at: datetime = PrivateAttr(default_factory=datetime.now)
    _secret_value: str = PrivateAttr()

    def __init__(self, **data):
        super().__init__(**data)
        self._secret_value = randint(1, 5)


m = TimeAwareModel()
print(m._processed_at)
print(m._secret_value)

Other changes

Replaced copy.deepcopy() usage with new pydantic.utils.smart_deepcopy() for faster immutable values handling

Related issue number

Closes #655

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
    (see changes/README.md for details)

@codecov
Copy link

codecov bot commented Jul 2, 2020

Codecov Report

Merging #1679 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1679   +/-   ##
=======================================
  Coverage   99.90%   99.90%           
=======================================
  Files          21       21           
  Lines        4035     4051   +16     
  Branches      805      807    +2     
=======================================
+ Hits         4031     4047   +16     
  Misses          3        3           
  Partials        1        1           
Impacted Files Coverage Δ
pydantic/class_validators.py 100.00% <100.00%> (ø)
pydantic/fields.py 100.00% <100.00%> (ø)
pydantic/schema.py 100.00% <100.00%> (ø)
pydantic/utils.py 100.00% <100.00%> (ø)
pydantic/generics.py 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5bfee87...104d600. Read the comment docs.

@Bobronium
Copy link
Contributor Author

Bobronium commented Jul 3, 2020

@samuelcolvin, seems like test_constrained_set_default introduced in dca9855 was broken somehow. #1627 (comment)

@Benoss
Copy link

Benoss commented Aug 24, 2020

I quite like this way of doing it

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry it's been so long, I've been really busy.

This looks amazing, but might need to be split into multiple PRs.

docs/usage/models.md Outdated Show resolved Hide resolved
bool,
bytes,
type,
type(None),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type(None),
NoneType,

(imported from ./typing.py)

)

ROOT_KEY = '__root__'
IMMUTABLE_NON_COLLECTIONS_TYPES: AbstractSet[Type[Any]] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IMMUTABLE_NON_COLLECTIONS_TYPES: AbstractSet[Type[Any]] = {
IMMUTABLE_NON_COLLECTIONS_TYPES: Set[Type[Any]] = {

Any reason not to use normal Set here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth adding a comment explaining this? Also maybe IMMUTABLE_TYPES would be a better name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth adding a comment explaining this?

Sure, I commented usage of this constant. Should I describe it alone here?

Also maybe IMMUTABLE_TYPES would be a better name?

Well, tuple() is immutable type and yet could not be included in this set, because it can contain mutable types, so I thought it's worth clarifying

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -94,7 +112,7 @@ def validate_field_name(bases: List[Type['BaseModel']], field_name: str) -> None
Ensure that the field's name does not shadow an existing attribute of the model.
"""
for base in bases:
if getattr(base, field_name, None):
if hasattr(base, field_name):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove this change, it's unrelated.

Obj = TypeVar('Obj')


def smart_deepcopy(obj: Obj) -> Obj:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks very cool, but are we 100% sure it always works? Or just works in all the cases we know of?

Also, this is great, but might be best to implement it as a separate PR, it's not really related to the core change here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, separate PR is more sutable for this change.

Or just works in all the cases we know of?

Good catch. I think implementing faster behaviour only for built-in collections and immutable types is a way to go.

pydantic/main.py Outdated
private_attributes: Dict[str, Any] = dict.fromkeys((slots,) if isinstance(slots, str) else slots, Undefined)
validate_private_attributes(private_attributes)

for base in reversed(bases) if _is_base_model_class_defined else ():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this just a cosmetic change? if so could you remove it to make this change easier to review, if not I think it belongs in another PR.

Copy link
Contributor Author

@Bobronium Bobronium Sep 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, unfortunately I don't remember. Should've put a comment here 🤦 . But from what I see, now it includes BaseModel and it didn't before. I'll try to remember the reason I did it this way and either put explaning comment, or change the line back.

pydantic/main.py Outdated
@@ -303,7 +306,8 @@ def __new__(mcs, name, bases, namespace, **kwargs): # noqa C901
'__schema_cache__': {},
'__json_encoder__': staticmethod(json_encoder),
'__custom_root_type__': _custom_root_type,
**{n: v for n, v in namespace.items() if n not in fields},
'__private_attributes__': private_attributes,
**{n: v for n, v in namespace.items() if n not in fields | private_attributes.keys()},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

humm, does this create fields | private_attributes.keys() on every iteration?

Either way, might be clearer to define it beforehand.

pydantic/main.py Outdated
object_setattr(self, '__fields_set__', state['__fields_set__'])
self._set_private_attributes(state['__private_attributes_values__'], need_copy=False, check=False)

def _set_private_attributes(self, source: 'DictStrAny', need_copy: bool = True, check: bool = True) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the two kwargs are always used together, maybe they can be combined into check_copy or from_pickle=False?

This method might need a docstring too.

pydantic/main.py Outdated
if value is not Undefined:
object_setattr(self, name, smart_deepcopy(value) if need_copy else value)
elif check and not hasattr(self, name):
raise AttributeError(f'private attribute "{name}" is unset')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe if check is True we should do the check first so the exception message is always the same?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or make it identical here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't get this one, could you explain?

return max(storage) + 1 if storage else 0


class BaseStorageModel(BaseModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the _processed_at example at the top of #655 is simpler and clearer?

Co-authored-by: Samuel Colvin <samcolvin@gmail.com>
Copy link
Contributor Author

@Bobronium Bobronium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelcolvin, thanks for review, I answered some comments and think I'll be able to split PR on the next weekend.

)

ROOT_KEY = '__root__'
IMMUTABLE_NON_COLLECTIONS_TYPES: AbstractSet[Type[Any]] = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth adding a comment explaining this?

Sure, I commented usage of this constant. Should I describe it alone here?

Also maybe IMMUTABLE_TYPES would be a better name?

Well, tuple() is immutable type and yet could not be included in this set, because it can contain mutable types, so I thought it's worth clarifying

Obj = TypeVar('Obj')


def smart_deepcopy(obj: Obj) -> Obj:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, separate PR is more sutable for this change.

Or just works in all the cases we know of?

Good catch. I think implementing faster behaviour only for built-in collections and immutable types is a way to go.

pydantic/main.py Outdated
object.__setattr__(self, '__fields_set__', state['__fields_set__'])
object_setattr(self, '__dict__', state['__dict__'])
object_setattr(self, '__fields_set__', state['__fields_set__'])
self._set_private_attributes(state['__private_attributes_values__'], need_copy=False, check=False)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelcolvin, maybe we should use dict.get() here, so unpickling models dumped with older versions of pydantic would work?

Suggested change
self._set_private_attributes(state['__private_attributes_values__'], need_copy=False, check=False)
self._set_private_attributes(state.get('__private_attributes_values__', {}), need_copy=False, check=False)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes makes sense.

pydantic/main.py Outdated
private_attributes: Dict[str, Any] = dict.fromkeys((slots,) if isinstance(slots, str) else slots, Undefined)
validate_private_attributes(private_attributes)

for base in reversed(bases) if _is_base_model_class_defined else ():
Copy link
Contributor Author

@Bobronium Bobronium Sep 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, unfortunately I don't remember. Should've put a comment here 🤦 . But from what I see, now it includes BaseModel and it didn't before. I'll try to remember the reason I did it this way and either put explaning comment, or change the line back.

samuelcolvin pushed a commit that referenced this pull request Oct 8, 2020
* add smart_deepcopy

* uncomment tuple in BUILTIN_COLLECTIONS, fix doc a bit

* Fix grammar

Co-authored-by: PrettyWood <em.jolibois@gmail.com>

* replace map() usage with generator comprehension, fix comment

Co-authored-by: PrettyWood <em.jolibois@gmail.com>
Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most things look good here, met me know if I've missed any questions.

pydantic/main.py Outdated
object.__setattr__(self, '__fields_set__', state['__fields_set__'])
object_setattr(self, '__dict__', state['__dict__'])
object_setattr(self, '__fields_set__', state['__fields_set__'])
self._set_private_attributes(state['__private_attributes_values__'], need_copy=False, check=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes makes sense.

)

ROOT_KEY = '__root__'
IMMUTABLE_NON_COLLECTIONS_TYPES: AbstractSet[Type[Any]] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@Bobronium
Copy link
Contributor Author

Bobronium commented Oct 21, 2020

I was procrastinating about this PR recently, as I thought that it might be worth adding support of default_factory for private fields, as long as other fields features maybe. But don't see any obvious way to do it. Maybe we should think it throw better before sticking with final implementation?

For example, this will give error about _processed_at was not set, because code where it's set is executed after BaseModel.__init__

from pydantic import BaseModel
from datetime import datetime

class TestExtra(BaseModel):
    a: int
    __slots__ = ('_processed_at',)
    _processed_at: datetime

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._processed_at = datetime.utcnow()

TestExtra(**{"a": 1})

I would prefer having something like this: (__slots__ are automatically constructed in ModelMetaclass.__new__, value is generated in BaseModel.__init__, or in validate_model)

from pydantic import BaseModel, PrivateField
from datetime import datetime

class TestExtra(BaseModel):
    a: int
    _processed_at: datetime = PrivateField(default_factory=datetime)  # or Field(private=True, ...

TestExtra(**{"a": 1})

@samuelcolvin
Copy link
Member

automatically generating __slots__ would be great, I think this would make it much easier to use.

@Bobronium
Copy link
Contributor Author

What do you think about PrivateField? Should it subclass regular field class, be an independent class, or it's better to use Field(private=True)?

# Conflicts:
#	pydantic/main.py
#	pydantic/utils.py
#	tests/test_utils.py
@samuelcolvin
Copy link
Member

I think probably easiest if it's a regular class.

I think we should keep it separate from Field because the point is that these attributes are NOT fields.


@no_type_check
def __setattr__(self, name, value):
def __setattr__(self, name, value): # noqa: C901 (ignore complexity)
Copy link
Contributor Author

@Bobronium Bobronium Oct 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had to set # noqa: C901 (ignore complexity) here because of today's changes in #1972. Is this the way to go, or something should be moved outside of this method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is fine.

@samuelcolvin
Copy link
Member

@MrMrRobat this is looking good, do you think it will be ready to merge today?

I want to get v1.7 released, but I'm happy wait until tomorrow if we can include this?

@Bobronium
Copy link
Contributor Author

do you think it will be ready to merge today?

Yeah, if don't have any comments or suggestions, it's ready :)

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, just a few things to clarify.

pydantic/main.py Outdated
object.__setattr__(__pydantic_self__, '__fields_set__', fields_set)
object_setattr(__pydantic_self__, '__dict__', values)
object_setattr(__pydantic_self__, '__fields_set__', fields_set)
__pydantic_self__._set_default_private_attributes(__pydantic_self__.__private_attributes__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean exactly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid this line since this is the critical path in terms of performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sets defaults of __private_attributes__, without this line default not default_factoru wouldn't work.
What's critical here in terms of performance? Function call itself or executing code of the function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I guess we need it.

Initialising/validating models is the critical path, I don't want to slow that down unless we absolutely have to.

__private_attr__: str = 'private attr value'

class Config:
underscore_attrs_are_private = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I this needs documenting in model_config.md. Also we need to be clear exactly what form names need to be in.

It looks from this like if you use PrivateAttr you can use _whatever, but with underscore_attrs_are_private you have to use __whatever__, but reading the code, I think that's not the case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually you've written it clear in the copy, but still better to use the same format in both code examples.

Is there anything special about __whatever__ or is it just that it starts with an underscore, like _whatever?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I think according to some PEP we're not supposed to use dunder methods, unless they're official to python, maybe leave them out of the example and just mention them in the text?

docs/examples/private_attributes.py Show resolved Hide resolved
pydantic/main.py Outdated Show resolved Hide resolved
Co-authored-by: Samuel Colvin <samcolvin@gmail.com>
Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is looking amazing, thank you @MrMrRobat.

I've tried it myself and i think it'll be extremely useful.

I have one small proposed change in name, but only other question is on model.copy() which currently doesn't copy private attributes, but instead initialises from the default or default factory. I think I would expect copy() to (deep?)copy the private attributes, a bit like __setstate__.

What do you think?

pydantic/main.py Outdated Show resolved Hide resolved
Co-authored-by: Samuel Colvin <samcolvin@gmail.com>
@samuelcolvin
Copy link
Member

you need to reference the new name, sorry I should have made that clear in my change proposal.

Will you change it or will I?

@Bobronium
Copy link
Contributor Author

Bobronium commented Oct 26, 2020

I think I would expect copy() to (deep?)copy the private attributes, a bit like setstate.

Yeah, you're right. Working on it.

@Bobronium
Copy link
Contributor Author

Working on it.

Done

@samuelcolvin samuelcolvin merged commit 664cbcf into pydantic:master Oct 26, 2020
@samuelcolvin
Copy link
Member

awesome, thank you.

I'm really looking forward to using this.

luckydonald added a commit to luckydonald/fastorm that referenced this pull request Oct 27, 2021
@samuelcolvin
Copy link
Member

Question on this in #4518.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Question: add private attribute
3 participants