Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allowing overriding of model names in get_model_name_map #6304

Open
3 of 13 tasks
aalexsmithh opened this issue Jun 29, 2023 · 6 comments · May be fixed by #6814
Open
3 of 13 tasks

Allowing overriding of model names in get_model_name_map #6304

aalexsmithh opened this issue Jun 29, 2023 · 6 comments · May be fixed by #6814

Comments

@aalexsmithh
Copy link

aalexsmithh commented Jun 29, 2023

Initial Checks

  • I have searched Google & GitHub for similar requests and couldn't find anything
  • I have read and followed the docs and still think this feature is missing

Description

I'd like to be able override the model names generated by get_model_name_map().

In the case of conflicts, often the verbose version (ref) of the model name is excessively verbose.

I'm using FastAPI to generate an OpenAPI spec of the app I'm building, which ultimately is being used to generate a client library for my service. It uses pydantic's get_model_name_map to generate model names to create the json spec. Because of a number of models have conflicting names, I end up with generated model names that are frustratingly verbose.

service/clients/clientA.py

clientA.py

class User(BaseModel):
  ...

service/clients/clientB.py

class User(BaseModel):
  ...

service/main.py

from service.clients import clientA, clientB

clientA.User()
clientB.User()

Because of the collision on the User model, when get_model_name_map() is run, I end up with model names like service__clients__clientA__User and service__clients__clientB__User. What I'd like to have is ClientAUser and ClientBUser.

I think it would be useful to allow overrides of this long model name to allow user-defined model titles when there are collisions.

Imagine perhaps a configuration like this, where the generated names are taken directly from the model config:

service/clients/clientA.py

clientA.py

class User(BaseModel):
  class Config:
    title = "ClientAUser"
  ...

Affected Components

Selected Assignee: @lig

@pydantic-hooky pydantic-hooky bot added the unconfirmed Bug not yet confirmed as valid/applicable label Jun 29, 2023
@lig lig closed this as completed Jun 30, 2023
@lig
Copy link
Contributor

lig commented Jun 30, 2023

@aalexsmithh thanks for the suggestion! It looks like a legitimate feature request to me 👍🏻.

Please, notice that Pydantic V1 won't receive new features anymore.

For now, I'd suggest using a custom class attribute, i.e. MyModel.__my_custom_model_name__ and a custom alternative to get_model_name_map() implementing the desired logic.

I think ConfigDict in Pydantic V2 could be extended with a new attribute, i.e. ConfigDict.model_name, to implement this in Pydantic V2.

Also, it seems like BaseModel.model_json_schema() could also benefit from this improvement.

@romulocollopy
Copy link

Hey, do you need help with this? I want to start contributing, especially with Rust code.

@lig
Copy link
Contributor

lig commented Jul 17, 2023

@romulocollopy Sure, go ahead!

@romulocollopy romulocollopy linked a pull request Jul 23, 2023 that will close this issue
5 tasks
@romulocollopy
Copy link

romulocollopy commented Jul 23, 2023

@lig, could you take a look at #6814 and see if it is moving in the right direction? If so, I will add documentation to the change.

This solution will change the cls.__name__ itself and may be too invasive. But I do not see a good reason to have the model_map or the model_json_schema["title"] different from the cls.__name__, given that it would be easily configurable.

In case there is a conflict in the set names, get_model_name_map will generate the long name. But Model.model_json_schema will use the set model_name. It seems like a desirable behaviour, but I am curious about your thoughts.

@dmontagu
Copy link
Contributor

I think the solution to this in pydantic v2 is to override pydantic.json_schema.GenerateJsonSchema.get_defs_ref in a custom subclass of GenerateJsonSchema, then pass that as the argument to model_json_schema.

We've considered adding functionality to allow types to more directly specify what the name/fallback-choices should be, but haven't settled on a preferred implementation. Would be happy to consider a PR modifying GenerateJsonSchema in some way to make this more readily possible though, but because you can subclass GenerateJsonSchema, technically it shouldn't be necessary to change this in a new pydantic release to get things working.

Worst case, if you are using a framework like FastAPI and can't modify the GenerateJsonSchema it uses (yet), you can probably get it to work by monkey-patching the pydantic.json_schema module by doing something like this:

import pydantic
from pydantic.json_schema import GenerateJsonSchema

...
class MyGenerateJsonSchema(GenerateJsonSchema):
    ...

pydantic.json_schema.GenerateJsonSchema = MyGenerateJsonSchema

@alexmojaki
Copy link
Contributor

It's not clear to me how a user is meant to override get_defs_ref without taking a close look at the source code, probably copying a lot of it, and relying on a protected attribute being stable. Surely their code has to look something like this?

class MyGenerateJsonSchema(GenerateJsonSchema):
    def get_defs_ref(self, core_mode_ref: CoreModeRef) -> DefsRef:
        my_def = ...  # insert complicated string parsing logic here
        full_def = super().get_defs_ref(core_mode_ref)
        self._prioritized_defsref_choices[full_def].insert(0, my_def)
        return full_def

Putting aside how a user might customize this, it seems that a lot could be done to reduce the need to customize. By default the choices jump from User to service__clients__clientA__User with nothing in between, ignoring the mode. I think they should also include clientA__User and clients__clientA__User. Then the final schema would probably have clientA__User and clientB__User to disambiguate. It's not quite as nice as ClientAUser but it's a whole lot better and closer than service__clients__clientA__User.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants