Pydantic changes json schema representation when extra keys are provided #3896

Closed
3 tasks done
dRacz3 opened this issue Mar 11, 2022 · 4 comments
Labels
bug V1 Bug related to Pydantic V1.X

Comments


dRacz3 commented Mar 11, 2022

Checks

  • I added a descriptive title to this issue
  • I have searched (google, github) for similar issues and couldn't find anything
  • I have read and followed the docs and still think this is a bug

Bug

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.9.0
            pydantic compiled: True
                 install path: /Users/raczdaniel/sandbox/jupiterbox/.ve/lib/python3.10/site-packages/pydantic
               python version: 3.10.2 (main, Feb  8 2022, 20:07:01) [Clang 13.0.0 (clang-1300.0.29.30)]
                     platform: macOS-12.1-arm64-arm-64bit
     optional deps. installed: ['devtools', 'dotenv', 'typing-extensions']

Creating nested model classes like this:

import json

from pydantic import BaseModel, Field


class RootClass(BaseModel):
    class NestedClassLevel(BaseModel):
        class NestedNestedClass(BaseModel):
            something: str

        foo: NestedNestedClass

    bar: NestedClassLevel


print(json.dumps(RootClass.schema(), indent=4))

will result in the following schema generated:

{
    "title": "RootClass",
    "type": "object",
    "properties": {
        "bar": {
            "$ref": "#/definitions/NestedClassLevel"
        }
    },
    "required": [
        "bar"
    ],
    "definitions": {
        "NestedNestedClass": {
            "title": "NestedNestedClass",
            "type": "object",
            "properties": {
                "something": {
                    "title": "Something",
                    "type": "string"
                }
            },
            "required": [
                "something"
            ]
        },
        "NestedClassLevel": {
            "title": "NestedClassLevel",
            "type": "object",
            "properties": {
                "foo": {
                    "$ref": "#/definitions/NestedNestedClass"
                }
            },
            "required": [
                "foo"
            ]
        }
    }
}

However, when an extra keyword argument is passed to Field for the nested field foo:

class RootClassWithExtraKwargs(BaseModel):
    class NestedClassLevel(BaseModel):
        class NestedNestedClass(BaseModel):
            something: str

        # The only difference: an extra key is passed to Field so that it is included in the schema
        foo: NestedNestedClass = Field(**{'this-key': 'will remove $ref and replaces it with allOf'})

    bar: NestedClassLevel


print(json.dumps(RootClassWithExtraKwargs.schema(), indent=4))

This results in the schema below. What I find problematic is that foo (NestedNestedClass) has changed from a simple $ref into an allOf whose value is a single-element list wrapping the original reference:

{
    "title": "RootClassWithExtraKwargs",
    "type": "object",
    "properties": {
        "bar": {
            "$ref": "#/definitions/NestedClassLevel"
        }
    },
    "required": [
        "bar"
    ],
    "definitions": {
        "NestedNestedClass": {
            "title": "NestedNestedClass",
            "type": "object",
            "properties": {
                "something": {
                    "title": "Something",
                    "type": "string"
                }
            },
            "required": [
                "something"
            ]
        },
        "NestedClassLevel": {
            "title": "NestedClassLevel",
            "type": "object",
            "properties": {
                "foo": {
                    "title": "Foo",
                    "this-key": "will remove $ref and replaces it with allOf",
                    "allOf": [ # why was $ref pushed under the `allOf` key? Makes very little sense to use this structure
                        {
                            "$ref": "#/definitions/NestedNestedClass"
                        }
                    ]
                }
            },
            "required": [
                "foo"
            ]
        }
    }
}
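
To check which fields in a generated schema are affected, a small stdlib-only helper can walk the schema dict and report every node that has this single-item allOf shape. This is only a sketch: `find_single_ref_allof` is a hypothetical name, not part of pydantic, and the `schema` dict is a hand-trimmed copy of the output above rather than something generated here.

```python
from typing import Any, Iterator


def find_single_ref_allof(node: Any, path: str = "#") -> Iterator[str]:
    """Yield paths of nodes shaped like {"allOf": [{"$ref": ...}], ...}."""
    if isinstance(node, dict):
        all_of = node.get("allOf")
        if (
            isinstance(all_of, list)
            and len(all_of) == 1
            and isinstance(all_of[0], dict)
            and set(all_of[0]) == {"$ref"}
        ):
            yield path
        # Keep walking so nested definitions are inspected too.
        for key, value in node.items():
            yield from find_single_ref_allof(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_single_ref_allof(item, f"{path}/{index}")


# Hand-trimmed fragment of the schema shown above (assumption: same shape).
schema = {
    "definitions": {
        "NestedClassLevel": {
            "properties": {
                "foo": {
                    "title": "Foo",
                    "this-key": "will remove $ref and replaces it with allOf",
                    "allOf": [{"$ref": "#/definitions/NestedNestedClass"}],
                }
            }
        }
    },
    "properties": {"bar": {"$ref": "#/definitions/NestedClassLevel"}},
}

wrapped_paths = list(find_single_ref_allof(schema))
print(wrapped_paths)
```

In the real output, the same call could be made on `RootClassWithExtraKwargs.schema()`; only `foo` should be reported, since `bar` keeps its plain `$ref`.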
@dRacz3 dRacz3 added the bug V1 Bug related to Pydantic V1.X label Mar 11, 2022
@dRacz3 dRacz3 changed the title Pydantic changes json schema representation when extra kez Pydantic changes json schema representation when extra keys arebp Mar 12, 2022
@dRacz3 dRacz3 changed the title Pydantic changes json schema representation when extra keys arebp Pydantic changes json schema representation when extra keys are provided Mar 12, 2022
@samuelcolvin
Member

This is a bug deep in schema.py; I doubt there's an easy fix, I'm afraid.


m10d commented Apr 17, 2022

I believe this is the same issue we are seeing after upgrading from 1.7.3 (quite old) to 1.9.0. We use pydantic.dataclasses, but as I understand it, the OP's example (using BaseModel) would be equivalent to the following pydantic.dataclasses.dataclass field: foo: NestedNestedClass = field(metadata={"this-key":"will remove $ref and replaces..."})

If so, we see the identical problem, and it is causing major headaches as the output of BaseModel.schema() is used to generate OpenAPI schemas and further validation.

It seems like this is a bug (as @samuelcolvin noted), although in the seemingly related #2592, the OP there seems to indicate it's expected.


For clarity, here's a quick runnable reproduction using pydantic.dataclasses and Python 3.10 that shows the same behavior.

"properties" : { "role" : { ... , "allOf": [ { "$ref": "#/definitions/UserRole"} is unexpected; the $ref should be listed directly in the role property dict ...

from dataclasses import field
import json
from enum import Enum, unique
from pydantic.dataclasses import dataclass


@unique
class UserRole(Enum):
    MANAGER = "manager"
    TECHNICIAN = "technician"
    VIEWER = "viewer"


@dataclass(frozen=True)
class User():
    """
    The following doctest PASSES,
    but the allOf occurrence looks to be a bug:
    
    >>> print(json.dumps(User.__pydantic_model__.schema(), indent=4))
    {
        "title": "User",
        "type": "object",
        "properties": {
            "role": {
                "description": "custom descr of UserRole field",
                "allOf": [
                    {
                        "$ref": "#/definitions/UserRole"
                    }
                ]
            },
            "email": {
                "title": "Email",
                "type": "string"
            }
        },
        "required": [
            "role",
            "email"
        ],
        "definitions": {
            "UserRole": {
                "title": "UserRole",
                "description": "An enumeration.",
                "enum": [
                    "manager",
                    "technician",
                    "viewer"
                ]
            }
        }
    }
    """
    role: UserRole = field(
        metadata=dict(
            description="custom descr of UserRole field",
        )
    )
    email: str


m10d commented Apr 17, 2022

@samuelcolvin we're upgrading from a very old version (1.7.3), and this is a change in behavior. Is it known which pydantic commit introduced it?

Testing several recent releases, it appears to have been introduced between

  • 1.7.4 (old behavior) and
  • 1.8.2 (reproduces the same bug)
  • Note that I'm running Python 3.10 against both of these; ostensibly that isn't supported?

Our update was driven by a necessary move to Python 3.10. I think I already know the answer after reading #2885, but it's worth asking whether there is any workaround available for the moment.

Thank you!

Contributor

dmontagu commented Apr 27, 2023

Unless I'm misunderstanding something, this isn't actually a bug but intended behavior: OpenAPI<3.1 doesn't allow sibling keys next to $ref, so if we were to not use the allOf: [{$ref: ...}] approach, it would end up removing the relevant keys in swagger UI.
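
To make the point above concrete, here is a toy sketch of a consumer that discards everything next to a $ref, which is how the comment describes OpenAPI<3.1 tooling. Nothing here is pydantic or Swagger UI code: `resolve_ignoring_siblings` is a hypothetical helper, and merging the allOf branches into one dict is a deliberate simplification for display purposes (allOf is really a conjunction of schemas, not a merge). The data is taken from the UserRole example earlier in the thread.

```python
from typing import Any, Dict


def resolve_ignoring_siblings(node: Dict[str, Any], definitions: Dict[str, Any]) -> Dict[str, Any]:
    """Mimic a consumer that, on seeing "$ref", discards every sibling key."""
    if "$ref" in node:
        name = node["$ref"].rsplit("/", 1)[-1]
        return dict(definitions[name])  # siblings such as "description" are dropped
    resolved = {key: value for key, value in node.items() if key != "allOf"}
    for branch in node.get("allOf", []):
        # Simplified merge: keys already set on the parent take precedence.
        for key, value in resolve_ignoring_siblings(branch, definitions).items():
            resolved.setdefault(key, value)
    return resolved


definitions = {
    "UserRole": {
        "title": "UserRole",
        "description": "An enumeration.",
        "enum": ["manager", "technician", "viewer"],
    }
}

sibling_form = {
    "description": "custom descr of UserRole field",
    "$ref": "#/definitions/UserRole",
}
wrapped_form = {
    "description": "custom descr of UserRole field",
    "allOf": [{"$ref": "#/definitions/UserRole"}],
}

lost = resolve_ignoring_siblings(sibling_form, definitions)["description"]
kept = resolve_ignoring_siblings(wrapped_form, definitions)["description"]
print(lost)
print(kept)
```

With the sibling form, the custom description is replaced by the one on the referenced definition; with the allOf form, it survives because the $ref sits one level down and has no siblings of its own.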

Given that, and the fact that pushing down into the single-item allOf is semantically equivalent in all drafts of JSON schema / OpenAPI versions I've seen, I don't think we will change this behavior by default. However, if it is a problem, we can consider improving the configurability of this in GenerateJsonSchema so as to eliminate this allOf-packing behavior if desired.

If you want to request we make it an option to generate JSON schema without pushing down $refs with sibling keys, please create a new issue for the feature request.
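
Until such an option exists, one possible stopgap is to post-process the generated dict and unwrap the single-item allOf yourself. The sketch below is stdlib-only and makes no pydantic calls; `collapse_single_ref_allof` is a hypothetical name. It assumes the consumer of the schema accepts sibling keys next to $ref. As far as I know that holds for JSON Schema 2019-09 and later and for OpenAPI 3.1, but per the comment above it does not hold for OpenAPI<3.1 tooling, where the siblings may be dropped.

```python
from typing import Any


def collapse_single_ref_allof(node: Any) -> Any:
    """Return a copy where {"allOf": [{"$ref": X}], ...} becomes {"$ref": X, ...}."""
    if isinstance(node, list):
        return [collapse_single_ref_allof(item) for item in node]
    if not isinstance(node, dict):
        return node
    result = {key: collapse_single_ref_allof(value) for key, value in node.items()}
    all_of = result.get("allOf")
    if (
        "$ref" not in result
        and isinstance(all_of, list)
        and len(all_of) == 1
        and isinstance(all_of[0], dict)
        and set(all_of[0]) == {"$ref"}
    ):
        # Only unwrap the exact shape discussed in this thread; anything else is left alone.
        del result["allOf"]
        result["$ref"] = all_of[0]["$ref"]
    return result


role_property = {
    "description": "custom descr of UserRole field",
    "allOf": [{"$ref": "#/definitions/UserRole"}],
}
collapsed = collapse_single_ref_allof(role_property)
print(collapsed)
```

The function builds new containers instead of mutating its input, so the original schema dict stays available for consumers that still need the allOf form. It would be applied to the whole output of `.schema()`, not just a single property.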
