Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

each_item should iterate only at the top-level instead of going through nested items #1933

Closed
kevlarr opened this issue Sep 17, 2020 · 8 comments · Fixed by #1991
Closed
Labels
bug V1 Bug related to Pydantic V1.X

Comments

@kevlarr
Copy link

kevlarr commented Sep 17, 2020

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.5.1
            pydantic compiled: True
                 install path: ...snip...
               python version: 3.6.9 (default, Jun 26 2020, 13:24:16)  [GCC 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.16)]
                     platform: Darwin-18.7.0-x86_64-i386-64bit
     optional deps. installed: ['typing-extensions']

I'm seeing what appears to be inconsistent behavior with each_item=True when I'm using a field that's a list of tuples.

For instance, a 'normal' model illustrates what seems to be logical behavior.

from typing import List, Tuple
from pydantic import BaseModel, validator

class Thing(BaseModel):
    things: List[str]

    @validator("things", each_item=False)
    def validate_all_things(cls, val):
        print(f"all vals: {val}")
        return val

    @validator("things", each_item=True)
    def validate_single_thing(cls, val):
        print(f"single val: {val}")
        return val


>>> Thing(things=["one", "two", "three,four"])

single val: one
single val: two
single val: three,four
all vals: ['one', 'two', 'three,four']
Thing(things=['one', 'two', 'three,four'])

But if the field is anything like List[Tuple[str], str]], then behavior is absolutely different.

class Thing2(BaseModel):
    things: List[Tuple[str, str]]

    @validator("things", each_item=False)
    def validate_all_things(cls, val):
        print(f"all vals: {val}")
        return val

    @validator("things", each_item=True)
    def validate_single_thing(cls, val):
        print(f"single val: {val}")
        return val


>>> Thing2(things=[("one", "two"), ("three", "four")])

single val: one
single val: two
single val: three
single val: four
all vals: [('one', 'two'), ('three', 'four')]
Thing2(things=[('one', 'two'), ('three', 'four')])

The output from validate_all_things makes perfect sense - it's just the entire list passed in.

I would expect, however, that each_item would do just what it implies, namely calling the validator on each item of the list no matter what the item is. Instead, it's operating on every value within each tuple within the list, and I cannot get it to perform in a way where val == ("one", "two").

Is this expected behavior? If so, why?

I mean, strings are iterable just like tuples, so why isn't each_item iterating over the characters of each string in the first case?

@PrettyWood
Copy link
Member

Hello @kataev
I agree the behaviour is not the expected one and for me this is a bug.

@PrettyWood PrettyWood added bug V1 Bug related to Pydantic V1.X and removed question labels Sep 18, 2020
@PrettyWood PrettyWood changed the title Why is each_item iterating through nested items instead of just the top-level list? each_item should iterate only at the top-level instead of going through nested items Sep 18, 2020
PrettyWood added a commit to PrettyWood/pydantic that referenced this issue Oct 11, 2020
When using a validator with `each_item`, the items are all validated
one by one. But if the items are also iterable the subitems would then
be validated because the validator would be kept as it is.
Now the validator passed to the items is changed and won't be propagated

closes pydantic#1933
@samuelcolvin
Copy link
Member

I can see it both ways. But my main problem is that this could well be a breaking change for some people who rely on this behaviour.

Doesn't this strictly speaking, need to wait for v2?

@PrettyWood
Copy link
Member

Good question @samuelcolvin. TBH I was so surprised when I saw this issue I really can't imagine this behaviour to be the expected one. Unfortunately it's always hard to make changes that could break someone case but I'm quite sure it's easily fixable.
When I think about it if we really want to be safe we could do what you suggested for improved unions: some kind of feature flag that would be all removed in v2.

@kevlarr
Copy link
Author

kevlarr commented Oct 26, 2020

@samuelcolvin Good point, I can actually see existing behavior be something that's relied on.. I'm normally not a fan of "just add another kwarg" so adding some keyword like validate_nested that defaults to False could work but might be a bit smelly... I dunno.

from typing import List
from pydantic import BaseModel, validator


class Thing(BaseModel):
    things: List[List[str]]

    # Existing behavior: operates on whole value for property
    @validator("things", each_item=False)
    def one(cls, val):
        assert len(val) > 1, "List must have multiple things in it"
        return val

    # Existing behavior: iterates through nested lists of property
    @validator("things", each_item=True)
    def two(cls, val):
        assert val, "Strings cannot be empty"
        return val

    # NEW BEHAVIOR: opt out of iterating through nested lists
    @validator("things", each_item=True, validate_nested=False)
    def three(cls, val):
        assert len(val) > 1, "Nested list must have multiple things in it"
        return val

# Invalid because of validator `one`
Thing(things=[["a", "b"]])

# Invalid because of validator `two`
Thing(things=[["a",  ""], ["c", "d"]])

# Invalid because of validator `three`
# Currently can't do this without iterating manually in a validator with `each_item=False`
Thing(things=[["a", "b"], ["c"]])

# Valid
Thing(things=[["a", "b"], ["c", "d"]])

@daviskirk
Copy link
Contributor

We've been bitten by this a few times.
It's especially tricky because people tend to check their validators with the expected behaviour (simple list for example) and it works as expected, and then get really confused by the bugs that occur when any sort of nested data is passed.
I can't really imagine anyone actually expecting this from the get go, but I guess people could rely on this behaviour after figuring it out.
If this can't be fixed until 2.0 perhaps it would be good to add a big warning in the docs on how this works?

Any resolution probably also resolves #1255 just to link the issue

@PrettyWood
Copy link
Member

PrettyWood commented Oct 27, 2020

Thanks for the other issue @daviskirk I linked it to my PR as well!
The more I think about it the more I'm convinced it's not the expected behaviour. But we could add an extra kwarg next to each_item like recursive or something else.
v1.8 would hence have a breaking change but the fix would be very easy for people relying on the recursive behaviour: just add this kwarg in the validator.
@samuelcolvin WDYT?

@kevlarr
Copy link
Author

kevlarr commented Oct 28, 2020

Ah, thanks for linking @daviskirk - I searched prior to adding issue but didn't see that one. Should we close mine @samuelcolvin?

@samuelcolvin
Copy link
Member

Sorry for the slow reply

Thanks for the other issue @daviskirk I linked it to my PR as well!
The more I think about it the more I'm convinced it's not the expected behaviour. But we could add an extra kwarg next to each_item like recursive or something else.
v1.8 would hence have a breaking change but the fix would be very easy for people relying on the recursive behaviour: just add this kwarg in the validator.
@samuelcolvin WDYT?

I think this is the most sensible solution. @PrettyWood if you update #1991 I'll merge it.

samuelcolvin pushed a commit that referenced this issue Feb 11, 2021
* fix: check only first sublevel for validators with `each_item`

When using a validator with `each_item`, the items are all validated
one by one. But if the items are also iterable the subitems would then
be validated because the validator would be kept as it is.
Now the validator passed to the items is changed and won't be propagated

closes #1933

* chore: add breaking change
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug V1 Bug related to Pydantic V1.X
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants