Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to bridge Pydantic models with SQLAlchemy? #214

Closed
sm-Fifteen opened this issue May 10, 2019 · 33 comments
Closed

How to bridge Pydantic models with SQLAlchemy? #214

sm-Fifteen opened this issue May 10, 2019 · 33 comments
Labels
question Question or problem question-migrate

Comments

@sm-Fifteen
Copy link
Contributor

Description

Up until database interaction is introduced, the tutorial for FastAPI uses pydantic models for everything, such as this example in the sextion on Extra Models :

class UserOut(BaseModel):
    username: str
    email: EmailStr
    full_name: str = None


class UserInDB(BaseModel):
    username: str
    hashed_password: str
    email: EmailStr
    full_name: str = None

This alows for the "database model" to have private data which will not be exposed through the API. Later, in the section on security, a similar trick is used, but this time using inheritance to stack the two models (which I find makes the return-casting used by some functions better encoded in the type system).

class User(BaseModel):
    username: str
    email: str = None
    full_name: str = None
    disabled: bool = None


class UserInDB(User):
    hashed_password: str
def get_db_user() -> UserInDB:
    return UserInDB(
        username="johndoe", full_name="John Doe",
        email="johndoe@example.com",
        hashed_password="fakehashedsecret",
        disabled=False
    )

def get_user() -> User:
    return get_db_user()

However, when proper databases are introduced, those Pydantic models are dropped in favor of a single SQLAlchemy ORM model, with no effort to bridge the two parts. And while one could see this as the SQLAlchemy models completely superseding the Pydantic models, the fullstack demo app appears to actually use both, so there appears to be value in attempting to use them together, something which the documentation doesn't seem to address.

So can/should Pydantic and SQLAlchemy models be used together? If they are, how is one meant to connect the two together and can this still be done while maintaining some kind of type hierarchy?

@sm-Fifteen sm-Fifteen added the question Question or problem label May 10, 2019
@dmontagu
Copy link
Collaborator

dmontagu commented May 10, 2019

From what I can tell, there is no problem using both pydantic models and SQLAlchemy models, and if you want to use sqlalchemy as your ORM you will need to use both.

I've found this a little frustrating because, at least in my experience, it leads to lots of similarly-named objects and repetition of definitions, and frequent translation between the types depending on how you are using them. @tiangolo I would be interested if you have any suggestions for design patterns that would reduce the amount of "translation" code associated with having separate pydantic and sqlalchemy models.

More generally, I would be interested to hear if anyone has found a clean approach to bridging the gap between pydantic and a database, especially if it integrates nicely with an asynchronous database driver (I've been using (encode/databases), which only supports SQLAlchemy core, not the ORM). In particular, it would be nice if I only had to create a single class for each conceptual model (I'd be okay with separate classes for pydantic and the orm as long as they could be derived from a common root).

My current approach has been to create a container class that holds references to the database table and the appropriate creation/in-db pydantic models (similar to UserCreate and UserInDB from the docs), and has a variety of generic classmethods for CRUD that make use of the creation/in-db types for input / return, but it still feels like a hack.

@ebreton
Copy link

ebreton commented May 11, 2019

Hi @dmontagu ,

I am interested in your 'hack' 😄 Could you share some code sample ?

On my side, I find it useful to split the models used for frontend interaction (Pydantic) from those used for the DB interaction and business logic (SQLAlchemy).

I have therefore followed the scaffold from https://github.com/tiangolo/full-stack-fastapi-postgresql, with some repetition between models and db_models. But again, I am fine with it since it adresses different purpose. The point where repetition becomes to cumbersome is the crud layer, where you basically have to copy-paste again and again the same pieces of code. I have made a PR to try to reduce this friction: tiangolo/full-stack-fastapi-template#23

I am curious to see how this goes along with @dmontagu approach...

By the way, fastapi takes care to translate SQLAlchemy models to pydantic ones on the api endpoint layer, with two limitations I have found so far on JSON fields (see #211 for details) and Enum fields in a specific use case (see #196)

@tiangolo
Copy link
Owner

So, Pydantic and SQLAlchemy are separated. Pydantic is used for documentation, validation, and data serialization. SQLAlchemy for SQL ORM stuff.

FastAPI is not coupled with any DB, so, SQLAlchemy is optional.

If you don't care about having documentation, validation, and serialization, you don't need to use Pydantic, you could return SQLAlchemy models directly. They would be converted to JSON in a "best-effort" way.

If you want both, for now, you have to write both.

I agree I don't like the duplication of properties in classes. But there's still no way to automatically generate one from the other.

That's a good candidate for a third-party package, that generates Pydantic models from SQLAlchemy models (or other ORMs) automatically. But it doesn't exist yet. There are some attempts at doing something similar but I don't know a complete solution yet. Hopefully, someone will build it (or I'll do it later).


For now, I'll take it as a request to update the docs clarifying the use of both Pydantic and SQLAlchemy at the same time.

@euri10
Copy link
Contributor

euri10 commented May 11, 2019

If you want another inspiration there's also https://github.com/thomaxxl/safrs

@tiangolo
Copy link
Owner

Thanks @euri10 .

@ebreton
Copy link

ebreton commented May 11, 2019

If you want another inspiration there's also https://github.com/thomaxxl/safrs

Thanks for the link @euri10 . I will stick with fastapi and the separation for now 💕

@sm-Fifteen
Copy link
Contributor Author

For now, I'll take it as a request to update the docs clarifying the use of both Pydantic and SQLAlchemy at the same time.

Yeah, that was my intention, the doc jumps from one to the other with little to no transition and it's really glaring considering how of the rest of the doc tends to flow together.

@tiangolo
Copy link
Owner

Great! I'll update it.

@dmontagu
Copy link
Collaborator

dmontagu commented May 12, 2019

+1 to clarifying the docs about how the pydantic models and sqlalchemy models should be used together, I also found this point a little confusing at first.

@ebreton I've been iterating on my approach, and due to weaknesses in some combination of 1) the python typing system, 2) pycharm's implementation of type checking, and 3) my understanding of python's typing system, I've modified my approach to make heavier use of instances for the sake of better-working auto-completion. In particular, it looks like generic class variables aren't supported in the python type hinting system, so I modified my approach so that the "container class" table is now actually just a separate instance for each separate model.

Here's a simplified version of my implementation (removing some complexity due to things like automatically generated uniqueness constraints)

import uuid
from typing import Generic, List, Tuple, Type, TypeVar

import sqlalchemy as sa
from databases import Database
from pydantic import BaseModel
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.sql import Insert


def get_metadata() -> sa.MetaData:
    ...  # returns my configured sqlalchemy.MetaData object


def get_database() -> Database:
    ...  # returns my configured databases.Database object


class BaseColumns:
    def values(self) -> List[sa.Column]:
        """
        Mirrors the `values` function on the table.c object from sqlalchemy
        """
        return list(self.__dict__.values())


T = TypeVar("T", bound=BaseColumns)


class TypeHintedTable(sa.Table, Generic[T]):
    """ Only intended for use as a typehint, NOT to be actually instantiated """

    @property
    def c(self) -> T:
        return NotImplemented

    @property
    def columns(self) -> T:
        return NotImplemented


ColumnsT = TypeVar("ColumnsT", bound=BaseColumns)
CreationModelT = TypeVar("CreationModelT", bound=BaseModel)
DatabaseModelT = TypeVar("DatabaseModelT", bound=BaseModel)


class PydanticTable(Generic[ColumnsT, CreationModelT, DatabaseModelT]):
    def __init__(self,
                 *,
                 table_name: str,
                 columns: ColumnsT,
                 creation_model: Type[CreationModelT],
                 db_model: Type[DatabaseModelT],
                 ):
        assert issubclass(creation_model, BaseModel)  # needed by pycharm, despite bound
        assert issubclass(db_model, BaseModel)  # needed by pycharm, despite bound

        self.columns: ColumnsT = columns
        self.creation_model: Type[CreationModelT] = creation_model
        self.db_model: Type[DatabaseModelT] = db_model
        self.table_name = table_name

        self.table: TypeHintedTable[ColumnsT] = self._get_table()

    def _get_table(self) -> TypeHintedTable[ColumnsT]:
        # In my actual implementation, I also generate uniqueness constraints
        table_args = self.columns.values()
        return sa.Table(self.table_name, get_metadata(), *table_args)

    # Some example CRUD methods (actual implementation has more)
    async def read_by_columns_values(
            self, *, columns: List[sa.Column], values: List[Tuple]
    ) -> List[DatabaseModelT]:
        assert all(len(value) == len(columns) for value in values)
        query = self.table.select().where(sa.tuple_(*columns).in_(values))
        result = await get_database().fetch_all(query)
        return [self.db_model(**x) for x in result]

    async def create(
            self, creation_requests: List[CreationModelT]
    ) -> List[DatabaseModelT]:
        # In my actual implementation, I also handle uniqueness conflicts
        insertion_values = [item.dict() for item in creation_requests]
        insertion_query = Insert(
            self.table,
            values=insertion_values,
            returning=self.table.c.values()
        )
        result = await get_database().fetch_all(insertion_query)
        return [self.db_model(**x) for x in result]

An example of instantiating the relevant classes for a particular conceptual model would then look like this:

class Product(BaseModel):
    brand: str
    name: str
    description: str


class ProductCreate(Product):
    pass


class ProductInDB(ProductCreate):
    product_id: uuid.UUID


class ProductColumns(BaseColumns):
    def __init__(self):
        self.product_id = sa.Column(
            "product_id", UUID, primary_key=True,
            server_default=sa.text("uuid_generate_v4()")
        )
        self.brand = sa.Column("brand", sa.String, nullable=False)
        self.name = sa.Column("name", sa.String, nullable=False)
        self.description = sa.Column("description", sa.String, nullable=False)


columns = ProductColumns()
ProductORM = PydanticTable(
    table_name="products",
    columns=ProductColumns(),
    creation_model=ProductCreate,
    db_model=ProductInDB,
)

ProductORM or the equivalent would then be imported anywhere you wanted to access the associated sqlalchemy table or the various crud methods. For example, if you have a fastapi endpoint that accepts, let's say, a list of ProductCreate as a body argument, the endpoint body can just be return await ProductORM.create(product_create_list) and it will return a list of the created ProductInDB.

In my case, I also wanted the ability to easily build more complex sqlalchemy core queries involving multiple models, and the reason for the BaseColumns + TypeHintedTable approach is that I can autocomplete to the columns in the table when I write ProductORM.table.c. (I haven't seen a way to get sqlalchemy to do this that doesn't involve heavily depending on the sqlalchemy ORM). I also have an unbound version of the read_by_columns_values method that lets me take the result of a more complex query and convert the result into pydantic models.

This approach works fine for my use case because my models are all essentially immutable, but I think most people will just find it easier to stick with sqlalchemy's ORM. In my case, I don't need to handle a very large number of distinct model-querying patterns and I find it a little easier to reason through the database accesses and associated performance implications when I'm closer to the raw sql, so I chose not to depend on sqlalchemy ORM. (But I would probably still use it though if it had improved async compatibility.)

@tiangolo
Copy link
Owner

A quick note, I'm working on a PR in Pydantic here: pydantic/pydantic#520

To improve the way we can interact with SQLAlchemy models and Pydantic, to solve several lazy-loading issues, loading relationships (when needed), handling lists of results, etc.

I'm also updating the SQLAlchemy tutorial with all those changes, to show how to use both SQLAlchemy and their role. And the same documentation scripts are being used for the tests.

But this work on the tutorial is on hold for a bit until the PR is merged.

@trim21
Copy link
Contributor

trim21 commented May 20, 2019

I read the conversation in your PR, but I'm not formalir with pydantic. does it mean that if I implement a __iter__ on my orm class, returning a instance in handler should work fine even without your PR merged?
I'm wrong

@bartvanesWB
Copy link

Looks like pydantic/pydantic#562 has been merged.

Any updates on this topic? FastAPI combined with nested models and sqlalchemy database functionality would be a great plus! Thanks for the good work so far, very impressing!

@tiangolo
Copy link
Owner

I just finished integrating Pydantic ORM mode into FastAPI, it is released as version 0.30.0 🎉

The new docs include how to use Pydantic with SQLAlchemy, how relationships work, etc: https://fastapi.tiangolo.com/tutorial/sql-databases/

@sm-Fifteen
Copy link
Contributor Author

Stellar! Thanks @tiangolo.

That pretty much solves the issue as far as I'm concerned.

@tiangolo
Copy link
Owner

Great! Thanks for reporting back and closing the issue @sm-Fifteen .

@acnebs
Copy link

acnebs commented Nov 18, 2019

Is it possible to create some sort of derived BaseModel that has orm_mode always set to true? Adding the Config subclass for every one of my models is not very DRY.

@euri10
Copy link
Contributor

euri10 commented Nov 18, 2019

yes it is, you just create your base class that has all what you want (in your case in particular orm_mode) then in all subsequent classes you inherit from that base class

@dmontagu
Copy link
Collaborator

dmontagu commented Nov 18, 2019

Yes @acnebs I do that myself, along with some other config settings and utility methods.

@kolypto
Copy link

kolypto commented Sep 13, 2020

I've developed a library that meticulously converts an SqlAlchemy model into Pydantic. It even supports relationships :)

https://github.com/kolypto/py-sa2schema

@collerek
Copy link

collerek commented Nov 2, 2020

Since I was tired of reinventing the wheel and needed something to tackle exactly the same problem I created ormar package - an async mini orm with fastapi in mind and pydantic validation. Ormar was inspired by encode/orm package which seems either stale or abandoned (and uses typesystemanyway).

Ormar bases its validation on pydantic so it can be used directly with fastapi as response and request models.

Feel free to check it out: https://github.com/collerek/ormar

@Kludex
Copy link
Sponsor Collaborator

Kludex commented Nov 2, 2020

@collerek why "mini"?

@collerek
Copy link

collerek commented Nov 2, 2020

@Kludex Cause it does not keep in memory any information regarding already instantiated objects so if you do something like this:

record1 = await Model.objects.get(pk=1)
record2 = await Model.objects.get(pk=1)

assert record1==record2
# True - but in reality they are 2 separate python objects that knows nothing about each other
# so if you do:
record1.parameter = 'something else'
await record1.save()
# the record2 does not update - you won't have errors like in sqlalchemy orm modifing two times the same objects etc.

So it behaves pretty much like i.e. gino orm.

Also starting in August the scope was much smaller (no many2many relations etc.), this has changed but the "mini" in name prevails.

When I compare to full fledged ORMs like django, sqlalchemy etc. it looks and feels tiny 😄

@kuwv
Copy link
Contributor

kuwv commented Jan 4, 2021

@sm-Fifteen also, https://pydantic-docs.helpmanual.io/usage/models/#orm-mode-aka-arbitrary-class-instances

@sm-Fifteen
Copy link
Contributor Author

@sm-Fifteen also, https://pydantic-docs.helpmanual.io/usage/models/#orm-mode-aka-arbitrary-class-instances

This was added to pydantic in response to this issue, see #214 (comment).

@ghost
Copy link

ghost commented Mar 5, 2021

Hi! I am also looking for some lib that could integrate SQLAlchemy and Pydantic. Anyone knows anything please let us know, too. Thanks.

@ycd
Copy link
Contributor

ycd commented Mar 5, 2021

@looselytiedshoelaces fyi see: #2194, probably related to your request.

@sm-Fifteen
Copy link
Contributor Author

Now that SQLAlchemy 1.4 is out, one possible way to solve that duplication issue would be to declare only one model per type using both the new dataclass support for SQLA and Pydantic's built-in dataclass support together. I haven't tested it myself yet, and the result would probably be "a bit" verbose, but it's at least worth looking into.

@miquelvir
Copy link

@sm-Fiffeen has anyone?

@tommytwoeyes
Copy link

tommytwoeyes commented Apr 18, 2021

Hello every body ...

I realize this has been resolved, and that even though I read the entire thread before posting this, someone has likely already pointed out the observation about which I'm writing.

Still, because it took me quite a while before the relationship between Pydantic models and typical ORM/ODM models [e.g. SQLAlchemy, tortoise, etc.--I really like the remodel package, for RethinkDB, btw ;-) ], I'm going to post my take on this concern:

It is mentioned in the docs (perhaps as a result of this discussion--I did not check), that the models Pydantic provides are meant to model Request (Body) and Response (Body); i.e. the standard objects used in many/most HTTP libraries, regardless of programming language.

This still leaves the issue of having a complex object model to keep in mind. I haven't tried Pydantic's orm_mode yet, but am looking forward to it. If nothing else, I hope my post helps someone, because it did help me assimilate FastAPI's concepts when I understood that the way Pydantic models were used in FastAPI (as demonstrated in the docs and discussed here ad nauseam) was not some entirely new concept, but an innovative and powerful new way to build Request and Response objects, with automatic validation and OpenAPI-compliant documentation built in.

@johnthagen
Copy link
Contributor

I came here looking to find a replacement for DjangoRESTFramework's ModelSerializer, which can be derived from a Django Model and avoid the need to duplicate any fields/names between the database representation and the serializer/API.

I've begun to use ormar, and it solves this need perfectly for FastAPI, at least for the somewhat simple case I'm using it for.

A single ormar Model is simultaneously a database Model and a Pydantic serialization BaseModel:

class Author(ormar.Model):
    class Meta(BaseMeta):
        tablename = "authors"

    id: int = ormar.Integer(primary_key=True)
    name: str = ormar.String(max_length=100)

@miquelvir
Copy link

Take a look at https://github.com/tiangolo/sqlmodel, it solves this (and really nicely). It is also a project by @tiangolo .

@multimeric
Copy link

I'm confused as to why the FastAPI page about SQL databases makes no mention of SQLModel, whereas SQLModel has detailed documentation for integration with FastAPI. Could this be amended?

@simonsan
Copy link

@tiangolo I think you created https://github.com/tiangolo/pydantic-sqlalchemy with this issue in the back of your mind. Which approach would you currently recommend (orm_mode vs. external libraries also mentioned in this issue)?

@tiangolo tiangolo changed the title [QUESTION] How to bridge Pydantic models with SQLAlchemy? How to bridge Pydantic models with SQLAlchemy? Feb 24, 2023
@tiangolo tiangolo reopened this Feb 28, 2023
Repository owner locked and limited conversation to collaborators Feb 28, 2023
@tiangolo tiangolo converted this issue into discussion #8295 Feb 28, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
question Question or problem question-migrate
Projects
None yet
Development

No branches or pull requests