Take description from docstring #222

Open
Peter9192 opened this issue May 13, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@Peter9192

Is your feature request related to a problem? Please describe.
I would like my dataclasses to be as concise and readable as possible. This makes them easier to maintain, especially for new/inexperienced developers.

Describe the solution you'd like
When you add a docstring to a class attribute, use that in the description field of the generated JSON schema. For example:

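from dataclasses import dataclass

from mashumaro.jsonschema import build_json_schema


@dataclass
class SimpleRadiationConfig:
    Q0: float = 100
    """Fixed net radiation."""


# Desired output, with "description" taken from the attribute docstring:
print(build_json_schema(SimpleRadiationConfig).to_json())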
{
    "type": "object",
    "title": "SimpleRadiationConfig",
    "properties": {
        "Q0": {
            "type": "number",
            "description": "Fixed net radiation.",
            "default": 100
        }
    },
    "additionalProperties": false
}

Describe alternatives you've considered
I saw #125, which achieves the same thing, but it requires annotating all attributes as "fields".
I believe pydantic also supports this, but it requires marking all classes as pydantic.BaseModel, which feels more invasive.

Additional context

@Fatal1ty Fatal1ty added the enhancement New feature or request label May 13, 2024
@Fatal1ty
Owner

Fatal1ty commented May 13, 2024

hi @Peter9192

I was thinking about it from the start but the inability to distinguish an automatically added technical docstring from an explicit one stopped me. Now I think that we can match the automatically generated docstring by its pattern. However, it’s worth adding a new builder parameter with the following possible values:

from enum import StrEnum


class DocStringDocumentation(StrEnum):
    FULL = "full"  # all docstrings will be used
    EXPLICIT_ONLY = "explicit_only"  # only explicitly added ones; this will be the default
    NONE = "none"  # none of them

What do you think? You can help with the naming to speed up the work.

Edit: All this applies to the dataclass documentation but not to a specific field. If you know whether pydantic adds field documentation based on the docstring, please give me more info. As far as I know, there is no way to document a specific field with a docstring in Python.

@Peter9192
Author

Hi @Fatal1ty, thanks for the quick response! I wasn't aware of auto-generated docstrings. Can you clarify what you mean? I did find some docstring generators, but I believe you're referring to something else. Perhaps the options could be parse_docstrings with options all, none, and explicit_only. I think "explicit_only" is quite clear, can't think of better alternatives.

All this applies to the dataclass documentation but not to a certain field.

Just to be sure: by "dataclass documentation", do you mean only the top level docstring on the dataclass? I was hoping this would be possible also for fields, i.e. class variables with a type annotation (not those explicitly defined with the field function).

I may have been a bit too quick to conclude that pydantic supports this. However, I did find a recent PR that seems to add this functionality.

In the past I generated automatic API docs with autodoc, which led me to believe it should be possible to extract this info quite easily. However, it seems this doesn't discriminate fields from other class members, which was okay for my use case but may be too limiting for a generic implementation.

I believe mkdocstrings also parses field docstrings, see mkdocstrings/python#58
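
For illustration, here is a minimal sketch of how a string literal placed right after an annotated attribute can be recovered by parsing the class source with the standard ast module. This is not how any of the tools mentioned above are implemented; the helper name attribute_docstrings and the albedo attribute are hypothetical, and the source is passed in as a string here (in practice it could come from inspect.getsource, which only works when the source file is available).

```python
import ast
import inspect
import textwrap


def attribute_docstrings(source: str, class_name: str) -> dict[str, str]:
    """Map annotated class attributes to the string literal that follows them.

    Hypothetical helper, for illustration only.
    """
    tree = ast.parse(textwrap.dedent(source))
    docs: dict[str, str] = {}
    for node in ast.walk(tree):
        if not (isinstance(node, ast.ClassDef) and node.name == class_name):
            continue
        # Look at each statement together with the one that follows it.
        for stmt, following in zip(node.body, node.body[1:]):
            is_field = isinstance(stmt, ast.AnnAssign) and isinstance(
                stmt.target, ast.Name
            )
            is_string = (
                isinstance(following, ast.Expr)
                and isinstance(following.value, ast.Constant)
                and isinstance(following.value.value, str)
            )
            if is_field and is_string:
                docs[stmt.target.id] = inspect.cleandoc(following.value.value)
    return docs


# The source is only parsed, never executed, so no imports are needed in it.
SOURCE = '''
@dataclass
class SimpleRadiationConfig:
    Q0: float = 100
    """Fixed net radiation."""
    albedo: float = 0.3  # hypothetical attribute without a docstring
'''

print(attribute_docstrings(SOURCE, "SimpleRadiationConfig"))
# {'Q0': 'Fixed net radiation.'}
```

Note that this only sees annotated assignments in the class body, so plain class members without a type annotation are ignored.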

@Fatal1ty
Owner

I wasn't aware of auto-generated docstrings. Can you clarify what you mean?

Sure, here it is:

from dataclasses import dataclass


@dataclass
class SimpleRadiationConfig:
    Q0: float = 100


print(SimpleRadiationConfig.__doc__)  # SimpleRadiationConfig(Q0: float = 100)
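
A rough sketch of how the auto-generated docstring could be matched by its pattern, assuming it always has the form of the class name followed by the __init__ signature (which is how the dataclasses module builds it). The function names is_autogenerated_doc and class_description are hypothetical, and the enum is written with a str mixin instead of StrEnum only so that the sketch also runs on Python versions older than 3.11.

```python
import inspect
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DocStringDocumentation(str, Enum):
    FULL = "full"  # all docstrings will be used
    EXPLICIT_ONLY = "explicit_only"  # only explicitly added ones
    NONE = "none"  # none of them


def is_autogenerated_doc(cls: type) -> bool:
    """Heuristic: does __doc__ equal "<ClassName>(<init signature>)"?"""
    try:
        signature = str(inspect.signature(cls)).replace(" -> None", "")
    except (TypeError, ValueError):
        signature = ""
    return cls.__doc__ == cls.__name__ + signature


def class_description(
    cls: type,
    mode: DocStringDocumentation = DocStringDocumentation.EXPLICIT_ONLY,
) -> Optional[str]:
    """Return the text to use as the schema description, or None."""
    doc = cls.__doc__
    if mode is DocStringDocumentation.NONE or not doc:
        return None
    if mode is DocStringDocumentation.EXPLICIT_ONLY and is_autogenerated_doc(cls):
        return None
    return inspect.cleandoc(doc)


@dataclass
class Implicit:
    Q0: float = 100


@dataclass
class Explicit:
    """Radiation settings."""

    Q0: float = 100


print(class_description(Implicit))  # None
print(class_description(Implicit, DocStringDocumentation.FULL))
print(class_description(Explicit))  # Radiation settings.
```

The check is a heuristic: a hand-written docstring that happens to be identical to the generated one would be treated as automatic.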

Just to be sure: by "dataclass documentation", do you mean only the top level docstring on the dataclass?

Yes, I mean the top level docstring because it's easy to get it from the __doc__ attribute.

I may have been a bit too quick to conclude that pydantic supports this. However, I did find pydantic/pydantic#6563 that seems to add this functionality.

I see. They use the ast module to parse the dataclass source code. I'm not sure it's a good idea to invent non-standard ways of setting field documentation that require parsing the code. I'm more inclined to use typing.Doc from PEP 727. The PEP hasn't been accepted so far, but Doc is already included in typing-extensions.

from typing import Annotated

from typing_extensions import Doc  # Doc is not in the standard typing module

class User:
    name: Annotated[str, Doc("The user's name")]
    age: Annotated[int, Doc("The user's age")]
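
As a sketch of how such Doc metadata could be read back at runtime without parsing any source code: typing.get_type_hints with include_extras=True keeps the Annotated metadata, and Doc exposes its text as the documentation attribute. The helper name field_descriptions and the email attribute are hypothetical, and the small fallback Doc class is a stand-in for illustration only, so the snippet also runs where typing-extensions is not installed or is too old to include Doc.

```python
from typing import Annotated, get_args, get_origin, get_type_hints

try:
    from typing_extensions import Doc
except ImportError:

    class Doc:  # stand-in for illustration only, not the real class
        def __init__(self, documentation: str) -> None:
            self.documentation = documentation


class User:
    name: Annotated[str, Doc("The user's name")]
    age: Annotated[int, Doc("The user's age")]
    email: str  # hypothetical attribute without documentation


def field_descriptions(cls: type) -> dict[str, str]:
    """Collect Doc(...) texts from Annotated type hints (hypothetical helper)."""
    descriptions: dict[str, str] = {}
    for name, hint in get_type_hints(cls, include_extras=True).items():
        if get_origin(hint) is not Annotated:
            continue
        # get_args returns the underlying type first, then the metadata.
        for meta in get_args(hint)[1:]:
            if isinstance(meta, Doc):
                descriptions[name] = meta.documentation
    return descriptions


print(field_descriptions(User))
# {'name': "The user's name", 'age': "The user's age"}
```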

On the other hand, as you rightly noted, other tools use the comment after the field as documentation for it. It might make sense to come up with a way to connect plugins to JSON Schema generation, one of which would add field documentation based on such comments.
