Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FastAPI 0.67.0 new dataclass related features conflict with some aspects of new SQLAlchemy dataclass-style model mapping #3616

Closed
9 tasks done
AlexanderPodorov opened this issue Jul 27, 2021 · 17 comments

Comments

@AlexanderPodorov
Copy link

First Check

  • I added a very descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the FastAPI documentation, with the integrated search.
  • I already searched in Google "How to X in FastAPI" and didn't find any information.
  • I already read and followed all the tutorial in the docs and didn't find an answer.
  • I already checked if it is not related to FastAPI but to Pydantic.
  • I already checked if it is not related to FastAPI but to Swagger UI.
  • I already checked if it is not related to FastAPI but to ReDoc.

Commit to Help

  • I commit to help with one of those options 👆

Example Code

from __future__ import annotations

from dataclasses import dataclass, field

from fastapi import Depends, FastAPI
from pydantic import BaseModel
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, registry, relationship, sessionmaker
from uvicorn import run

SQLALCHEMY_DATABASE_URL = "sqlite:///./sql_app.db"

engine = create_engine(
    SQLALCHEMY_DATABASE_URL, connect_args={"check_same_thread": False}
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

mapper = registry()


@mapper.mapped
@dataclass
class User:
    __tablename__ = "users"
    __sa_dataclass_metadata_key__ = "sa"

    id: int = field(
        metadata={
            "sa": Column(Integer, primary_key=True, index=True)
        }
    )
    email: str = field(
        metadata={
            "sa": Column(String, unique=True, index=True)
        }
    )
    items: list[Item] = field(
        init=False,
        metadata={
            "sa": relationship("Item", lazy="raise")
        }
    )

    @property
    def greeting(self):
        return f"Hello {self.email}"


@mapper.mapped
@dataclass
class Item:
    __tablename__ = "items"
    __sa_dataclass_metadata_key__ = "sa"

    id: int = field(
        metadata={
            "sa": Column(Integer, primary_key=True, index=True)
        }
    )
    title: str = field(
        metadata={
            "sa": Column(String, index=True)
        }
    )
    owner_id: int = field(
        init=False,
        metadata={
            "sa": Column(Integer, ForeignKey("users.id"))
        }
    )


class UserIn(BaseModel):
    id: int
    email: str


class UserOut(BaseModel):
    id: int
    email: str
    greeting: str

    class Config:
        orm_mode = True


mapper.metadata.create_all(bind=engine)

app = FastAPI()


def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


@app.post("/users/", response_model=UserOut)
def create_user(user: UserIn, db: Session = Depends(get_db)):
    instance = User(
        id=user.id,
        email=user.email
    )
    db.add(instance)
    db.commit()
    return instance


if __name__ == '__main__':
    run(app)

Description

Example code works if fastapi==0.66.1 and it does not work if fastapi==0.67.0.
Open a browser, hit POST /users/ endpoint and create a user

Expected behaviour: User is created and displayed successfully.
Current behavior: User is created successfully (check in DB), but is not displayed successfully due to serialization errors.

Most likely the reason is in the new dataclass related features of FastAPI 0.67.0. Since the new-style database models are dataclasses as well, therefore they are affected too. As far as I can see if an instance is the dataclass, then FastAPI makes a dict (dataclasses.asdict(res)) out of instance before doing serialization.
It has two issues: first, if a dataclass has a property, it won't be serialized; second, if a dataclass has a relationship with lazy="raise" (means we should load this relationship explicitly), it is actually accessed and caused SQLAlchemy's exception.
Technically, if we let pydantic to serialize this dataclass, we won't get any of those issues. I assume it was the case for previous versions of FastAPI.

Here are a few links:
https://docs.sqlalchemy.org/en/14/orm/mapping_styles.html#orm-declarative-dataclasses-declarative-table
https://docs.sqlalchemy.org/en/14/orm/loading_relationships.html#prevent-lazy-with-raiseload

Operating System

Windows

Operating System Details

No response

FastAPI Version

0.67.0

Python Version

3.9.6

Additional Context

No response

@AlexanderPodorov AlexanderPodorov added the question Question or problem label Jul 27, 2021
@himbeles
Copy link
Contributor

I think this is caused by the same dataclass serialization issue as #3608 .

@himbeles
Copy link
Contributor

Could you check if my patch in PR #3607 fixes the issue for you?

@DomWeldon
Copy link

I have also experienced this issue this week. It caused a huge slowdown in generating the openapi schema in 0.64 but broke when I updated to 0.67.

The slowdown in schema generation seemed to come from pedantic.create_model() calling copy.deepcopy() on an object containing a sqlalchemy.Column object, which added 10s to my startup time.

A pydantic field has a field.field_info.extras property which contains your database field metadata. I simply called .pop(model.__sa_dataclass_metadata_key) for each field on my response_model, which fixed both the slowdown and the issue with the encoder in 0.67.

However I've been meaning to report this as a bug for the last day or two. A longer term fix is of course needed, but whether this is in FastAPI or Pydantic is maybe a matter for discussion.

@AlexanderPodorov
Copy link
Author

@himbeles , thank you! Unfortunately your fix won't work since in my case such a statement is problematic:

obj_dict =  dataclasses.asdict(obj)

It grabs all dataclass fields into the dict, so if we have a relationship field (see my example) with lazy='raise', then it will cause SQLAlchemy exception. I load my relationships differently depending on situation, to prevent DB load.

@AlexanderPodorov
Copy link
Author

I have no hope that this is going to be fixed someday, especially for reason that it could affect new dataclass related features.
So I will explicitly convert my dataclass DB models to pydantic models using BaseModel.from_orm method.

@Lunrtick
Copy link

I've run into the same issue, and have created a pull request that addresses it for me (PR #4094) - I'm not 100% sure about the approach though, let me know what you think.

@AlexanderPodorov
Copy link
Author

@Lunrtick , I think a bool flag like that is not a good solution. It would be nicer to call from_orm of pydantic somehow internally when the response is created. FastAPI 0.68.1 added support for read_with_orm_mode, to support SQLModel relationship attributes. I have not investigated it yet, but it seems it should have something to do with this new setting. Although, it's related to SQLModel, not to SQLAlchemy.

@Lunrtick
Copy link

@AlexanderPodorov I agree - it's definitely not the cleanest solution. I thought it'd make sense because we'd be opting in to a "legacy" feature, but it'd also mean FastAPI would need to support that going forward.

Perhaps we could emulate the from_orm behaviour you mention - I've updated my pull request so that a dataclass specifying a __read_with_orm_mode__ = True attribute would behave the same way as Pydantic models with the read_with_orm_option do.

There may be better ways to allow specifying this option though... Is it Pythonic to have a custom dunder property?

In my codebase, this change only requires me to update my base class, from which all my SQLAlchemy models inherit.

@AlexanderPodorov
Copy link
Author

@Lunrtick yes, I think it's a bit better. And yes, I think it's fine to have a custom dunder class attribute. SQLAlchemy relies on them too (__tablename__, __table_args__, etc.), so it don't break the pattern. I also think the solution must not depend on specific ORM, and should work fine for other ORMs. It seems we don't even need __read_with_orm_mode__ = True as you mentioned. Normally when you want to parse ORM object into pydantic model, this pydantic model must specify from_orm = True in the config class. So, if pydantic response_model is specified for a APIRoute, then FastAPI could try to parse it like this: model = response_model.from_orm(result), and do other steps based on whether the exception is thrown or not. Maybe I'm missing something... But I think it's up to @tiangolo to decide.

@gandhis1
Copy link

Is this ever going to get fixed? A rather big issue I would say

@sandys
Copy link

sandys commented Jul 8, 2022

hi guys,
is the pending PR for this issue going to be merged ? #4094

this has been affecting us as well

@tiangolo
Copy link
Owner

tiangolo commented Sep 4, 2022

Interesting, I would think that the main problem here would be that SQLAlchemy supports a bit of dataclasses, but it doesn't support dataclasses.asdict() and that's where it breaks.

So, I would think that the problem is not with dataclasses, and not necessarily even with FastAPI, but with the quirk that SQLAlchemy is doing its best to support dataclasses while not being made to do so, so it can only do so much, while still modifying the class and attributes in several ways.

I see there are some possible approaches, like the __read_with_orm_mode__ attribute, maybe a ClassVar. I don't see a predefined "standard" way of modifying dataclasses with additional metadata. But the problem is that there's no certainty or guarantee that SQLAlchemy won't break after adding more metadata in dataclass attributes or in some other way. So, there's no guarantee that adding support for a __read_with_orm_mode__ will keep working, it would be a workaround at best.

I'm not sure what's the best approach here. I hope to provide a way to configure how things are serialized and converted internally at some point, and that would allow customizing these things.

I also made SQLModel to solve this kind of problems and avoid code duplication: https://sqlmodel.tiangolo.com/ , you can also check if that works for you.


About just parsing directly from the response_model with model.from_orm, the problem is that response_model takes any type accepted by a Pydantic field, that means it also takes List[Item], and a list doesn't have model.from_orm.

@AlexanderPodorov
Copy link
Author

@tiangolo , thanks for your great work and FastAPI!

I don't see a good way of address this issue either.
AFAIK SQLAlchemy fully supports dataclasses related functionality, however there are two quirks with dataclasses.asdict which is used internally by FastAPI for serialization:

  • if a SQLA dataclass has a @property, it don't get serialized by dataclasses.asdict
  • if a SQLA dataclass has a relationship attribute declared with lazy="raise", it get serialized, however it should not in some cases.

I would avoid calling dataclasses.asdict for serialization. E.g. if response_model is Pydantic model, we could do Model.from_orm(obj) and then convert it to dict. In case of List[Item] we could do the same, but within list comprehension.

I'm now returning Pydantic models (parsed from SQLA dataclasses) from my endpoints.
Based on my example: instead of return instance, I'm doing return UserOut.from_orm(instance). And it works great.

@tiangolo
Copy link
Owner

Thanks for the input @AlexanderPodorov!

Yep, it seems SQLAlchemy supports several parts of dataclasses, but it seems it specifically doesn't support (at least not fully) dataclasses.asdict. And that precisely is what would be used in dataclasses to convert to dictionaries. It would break even outside of FastAPI.

I would avoid calling dataclasses.asdict for serialization. E.g. if response_model is Pydantic model, we could do Model.from_orm(obj) and then convert it to dict. In case of List[Item] we could do the same, but within list comprehension.

The problem with this is that it would work only for the specific case where the response_model is a Pydantic model, or where it is a list of models. But other cases wouldn't work. For example, a Union, a Dict of str to models, etc. So it would be mainly a workaround that would seem to work in some cases but not in others. I would prefer to avoid that.

I hope to, in some future release, allow customizing how to serialize return values. This would probably allow overriding this functionality to accommodate this particular use case, for example.

On the other hand, Pydantic V2 will have first class support for other types apart from models. Maybe that could help with these use cases too.

I guess it would also be nice to have support in SQLAlchemy for dataclasses.asdict when dataclasses are used, to have more complete support for dataclasses in general there, and avoid these small complications. Maybe there could be a config in the class saying what to serialize or something, to avoid the exception. I think that would probably be the right/best solution for this, as this is independent from FastAPI. But anyway, that's not really for me to say or decide there.

@AlexanderPodorov
Copy link
Author

Thanks for the comments @tiangolo !

SQLAlchemy actually supports dataclasses.asdict. The quirk here is that when we define a relationship attribute with lazy="raise" (we don't need this relationship to be loaded), it still gets affected and serialized by dataclasses.asdict even if it is not defined in response_model. Let's say our SQLAlchemy object is called obj, and response_model is Item. When we call Item.from_orm(obj) it works well since Pydantic does not access the attributes that were not defined in the model.

Yes, I understand there could be some complications with other use cases of jsonable_encoder.
I'm generally OK with my workaround return Item.from_orm(obj) instead of return obj. It seems more explicit: response_model and path operation return type are the same thing.

I'm doubtful about having some config class specifying what to serialize. Technically we could have the same object, but we want to serialize it differently in different situations.

If you want, I can close this issue or keep it open until we get better solution. Thanks!

@github-actions github-actions bot removed the answered label Sep 13, 2022
@joshtemple
Copy link

joshtemple commented Sep 23, 2022

I'm fairly new to SQLAlchemy, but I've been testing SQLAlchemy 2.0, and I've noticed that the new declarative model definitions result in model instances where dataclasses.is_dataclass(obj) == True, which has been causing recursion errors by trying to infinitely serialize relationships defined within those models.

So it seems this issue may affect all models in SQLAlchemy 2.0, not just dataclass models. I've had to comment out the code responsible for dict conversion so that my models can be properly parsed as Pydantic models with from_orm.

Edit: I was using the MappedAsDataclass mixin and had forgotten it, so disregard the above. This is not a new behavior in SQLAlchemy 2.0, just the way I was using the library.

@tiangolo
Copy link
Owner

Thanks for the comments! Yep, if you solved your use case you can close the issue @AlexanderPodorov 🤓

@tiangolo tiangolo reopened this Feb 27, 2023
Repository owner locked and limited conversation to collaborators Feb 27, 2023
@tiangolo tiangolo converted this issue into discussion #6524 Feb 27, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Projects
None yet
Development

No branches or pull requests

8 participants