This issue was moved to a discussion.

create_cloned_field: slow performance with many models #4644

Closed
9 tasks done
zanieb opened this issue Mar 4, 2022 · 13 comments

Comments

@zanieb
Contributor

zanieb commented Mar 4, 2022

First Check

  • I added a very descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the FastAPI documentation, with the integrated search.
  • I already searched in Google "How to X in FastAPI" and didn't find any information.
  • I already read and followed all the tutorial in the docs and didn't find an answer.
  • I already checked if it is not related to FastAPI but to Pydantic.
  • I already checked if it is not related to FastAPI but to Swagger UI.
  • I already checked if it is not related to FastAPI but to ReDoc.

Commit to Help

  • I commit to help with one of those options 👆

Example Code

import fastapi
import pydantic


class NestedModel(pydantic.BaseModel):
    x: pydantic.BaseModel
    y: pydantic.BaseModel


def create_app():
    for _ in range(100):
        fastapi.routing.APIRoute(
            "/test", endpoint=lambda: ..., response_model=NestedModel
        )


# PROFILING

import yappi

yappi.set_clock_type("CPU")

with yappi.run():
    create_app()

stats = yappi.get_func_stats()
stats.save("fastapi.pprof", type="pstat")

Description

When building a FastAPI application with nested Pydantic models, the create_cloned_field utility in the APIRoute initialization is quite slow.

For the trivial example, you can see that create_cloned_field dominates the runtime with 90% of CPU time. The majority of this is spent deep copying.

Note: timings are CPU time, not wall-clock time.
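The difference between the two clocks can be seen with the standard library alone. This is only an illustrative sketch using `time.process_time` and `time.perf_counter`; it is not how yappi measures time internally, and the `measure` helper is a hypothetical name.

```python
import time


def measure(fn):
    """Run fn() and return (wall_seconds, cpu_seconds) for the call."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process (excludes sleep)
    fn()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


# Sleeping consumes wall-clock time but almost no CPU time.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

A CPU-time profile therefore hides time spent waiting (I/O, sleeps) and highlights work such as deep copying.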

Profiling of example

If we replace this trivial application with the one from Prefect (from prefect.orion.api.server import create_app), we can see that the cost is also significant in a real-world application.

Profiling of Prefect app creation

With a patch to retain the cache across calls to this function, we can get this down to 50% of the CPU time with a ~5x overall speedup.

Profiling of example with patch

This speedup persists and is even more significant in a real-world application with create_cloned_field accounting for only 11% of the CPU time.

Profiling of Prefect app creation with patch
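The idea of retaining the cache across calls can be sketched without FastAPI or Pydantic. Everything below is a toy stand-in under stated assumptions: `Model`, `Leaf`, `Nested`, `clone_model` and `build_routes` are hypothetical names, and "cloning" is reduced to creating a subclass. It is not the code from the pull request, only a sketch of the memoization it relies on.

```python
from typing import Dict, Optional


class Model:
    """Toy stand-in for a model class; `fields` maps names to nested model types."""
    fields: Dict[str, type] = {}


class Leaf(Model):
    fields = {}


class Nested(Model):
    fields = {"x": Leaf, "y": Leaf}


clone_count = 0  # number of classes actually created


def clone_model(model: type, memo: Optional[Dict[type, type]] = None) -> type:
    """Return a clone (subclass) of `model`, reusing clones recorded in `memo`."""
    global clone_count
    if memo is None:
        memo = {}  # fresh memo: nothing is reused across top-level calls
    cached = memo.get(model)
    if cached is not None:
        return cached
    clone_count += 1
    clone = type(model.__name__, (model,), {})
    memo[model] = clone  # register before recursing so recursive models terminate
    clone.fields = {name: clone_model(sub, memo) for name, sub in model.fields.items()}
    return clone


def build_routes(n: int, shared_memo: Optional[Dict[type, type]]) -> int:
    """Clone `Nested` once per route and return how many classes were created."""
    global clone_count
    clone_count = 0
    for _ in range(n):
        clone_model(Nested, shared_memo)
    return clone_count


without_cache = build_routes(100, shared_memo=None)  # fresh memo per route
with_cache = build_routes(100, shared_memo={})       # one memo for all routes
print(without_cache, with_cache)
```

With a fresh memo per route, every route re-clones the whole model tree; with one shared memo the tree is cloned once and later routes only perform lookups.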

Operating System

macOS

Operating System Details

No response

FastAPI Version

0.74.0

Python Version

3.8.12

Additional Context

This may also be resolvable with pydantic/pydantic#1008 as mentioned in #894 (comment)

@ddanier
Sponsor

ddanier commented Apr 6, 2022

We are currently having this issue in a bigger FastAPI project. Over the last months the startup time has increased to anywhere from one minute to several minutes, just to get the server running, and again on every code change during development. As a quick test we simply "removed" the create_cloned_field call, which reduced the startup time drastically.

So I would love to get this fixed. ;-)

@ddanier
Sponsor

ddanier commented Apr 7, 2022

Little update: I did use the patch provided in the pull request after all, but just reverted it on my local dev setup because it caused problems when working with ForwardRefs. Those no longer seem to be resolved, possibly because an "old version" of the model is now cached. I have to dig into this, but the patch will introduce regression bugs for sure... just so you know.

@ddanier
Sponsor

ddanier commented Apr 7, 2022

Update again - this is a real issue:
We "fixed" the ForwardRef issue by calling update_forward_refs before any routes are defined. This way the cache does not mess things up, although I still think this has to be redone in a clean way.

The thing is, besides taking basically forever to load without the patch and loading in about 30 seconds with the patch, the RAM usage also skyrockets, which the patch fixes as well.

Patch enabled    RAM usage
No               1165 GiB
Yes              374 MiB

Again just FYI - we will need a good solution for that.

@zanieb
Contributor Author

zanieb commented Apr 7, 2022

Thanks for giving this a test, @ddanier! Can you share a minimal, complete example that uses forward refs and demonstrates how it breaks your routes? I can investigate adding handling for that case.

@ddanier
Sponsor

ddanier commented Apr 7, 2022

@madkinsz I will do that... sadly I have not had the time yet to really look into this. All the details above are bits my colleagues gave me while working on our own performance issue. The project we are trying to use this in is pretty huge, and even the ForwardRefs are not just plain, normal ForwardRefs (they are part of dynamically generated pydantic models), so I will need to set up a new project to get a simple example for this. What I can say for sure is that the problem occurred after we tried the patch and thus must be a direct result of it.

Sorry if this takes some time, much to do currently. :)

@teebu

teebu commented Jun 4, 2022

Any update on this?

@MarkusSintonen

MarkusSintonen commented Aug 18, 2022

We are also heavily hit by this issue. But we added a hacky workaround for our tests which speeds up the test initialization hugely! It would be great to get the issue fixed.

The hack is here for anyone hit by the same issue:

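from typing import Dict, Iterator, Optional, Type

import pytest
from fastapi import FastAPI
from pydantic import BaseModel
from pydantic.fields import ModelField

FASTAPI_CLONED_TYPES_MEMO: Dict[Type[BaseModel], Type[BaseModel]] = {}


# Workaround for FastAPI initialization slowness causing slow test startup.
# https://github.com/tiangolo/fastapi/issues/4644
@pytest.fixture(scope="session")
def fix_slow_fastapi_issue() -> Iterator[None]:
    import fastapi.routing
    import fastapi.utils

    orig = fastapi.utils.create_cloned_field

    def patch(
        field: ModelField, *, cloned_types: Optional[Dict[Type[BaseModel], Type[BaseModel]]] = None
    ) -> ModelField:
        # Reuse one memo across all routes unless the caller passes its own.
        return orig(field, cloned_types=FASTAPI_CLONED_TYPES_MEMO if cloned_types is None else cloned_types)

    # The built-in `monkeypatch` fixture is function-scoped and cannot be
    # requested from a session-scoped fixture, so a MonkeyPatch is created
    # directly here (assumes pytest >= 6.2).
    with pytest.MonkeyPatch.context() as patcher:
        patcher.setattr(fastapi.utils, "create_cloned_field", patch)
        patcher.setattr(fastapi.routing, "create_cloned_field", patch)
        yield


@pytest.fixture(scope="session")
def http_app(fix_slow_fastapi_issue: None) -> FastAPI:
    # create_test_app is the project's own app factory.
    return create_test_app()  # Does the FastAPI initialization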

Profiles via pytest-profiling
Before:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
...
      818    0.009    0.000   13.524    0.017 routing.py:504(add_api_route)
      820    0.025    0.000   13.514    0.016 routing.py:308(__init__)
20409/778    0.123    0.000    9.486    0.012 utils.py:76(create_cloned_field)
...

After:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
...
      818    0.007    0.000    3.557    0.004 routing.py:504(add_api_route)
      820    0.023    0.000    3.548    0.004 routing.py:308(__init__)
 2839/778    0.016    0.000    1.086    0.001 utils.py:76(create_cloned_field)
...

cc @ddanier: what issue are you seeing with the proposed fix? Do you have an example to share? We are also running a big FastAPI service, but the patch seems to work for us.

@ddanier
Sponsor

ddanier commented Sep 13, 2022

We are currently using a monkey patched version of the normal main.py like this:

# flake8: noqa
"""
This is a faster version of the main.py.

This version is faster by monkey patching the internals of FastAPI. It is
intended to NEVER be used in production. It is only for testing and
local development.

To activate this file, you can just create a `docker-compose.override.yml`
with the following contents:

---
version: "3.6"

services:
  api:
    command:
      [
        "poetry",
        "run",
        "uvicorn",
        "something.fast_main:app",
        "--host=0.0.0.0",
        "--reload",
      ]
---
"""

from dataclasses import is_dataclass
from typing import Optional, cast
from weakref import WeakKeyDictionary

from pydantic import BaseModel, create_model
from pydantic.fields import ModelField
from pydantic.utils import lenient_issubclass


def patched_create_cloned_field(
        field: ModelField,
        *,
        cloned_types: Optional[dict[type[BaseModel], type[BaseModel]]] = WeakKeyDictionary(),
) -> ModelField:
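    # cloned_types holds already-cloned types, to support recursive models.
    # The default WeakKeyDictionary is created once, at definition time, and
    # is shared by every call that omits cloned_types, so it acts as the cache.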
    if cloned_types is None:
        cloned_types = WeakKeyDictionary()
    original_type = field.type_
    if is_dataclass(original_type) and hasattr(original_type, "__pydantic_model__"):
        original_type = original_type.__pydantic_model__
    use_type = original_type
    if lenient_issubclass(original_type, BaseModel):
        original_type = cast(type[BaseModel], original_type)
        use_type = cloned_types.get(original_type)
        if use_type is None:
            use_type = create_model(original_type.__name__, __base__=original_type)
            cloned_types[original_type] = use_type
            for f in original_type.__fields__.values():
                use_type.__fields__[f.name] = patched_create_cloned_field(
                    f, cloned_types=cloned_types,
                )
    new_field = fastapi.utils.create_response_field(name=field.name, type_=use_type)
    new_field.has_alias = field.has_alias
    new_field.alias = field.alias
    new_field.class_validators = field.class_validators
    new_field.default = field.default
    new_field.required = field.required
    new_field.model_config = field.model_config
    new_field.field_info = field.field_info
    new_field.allow_none = field.allow_none
    new_field.validate_always = field.validate_always
    if field.sub_fields:
        new_field.sub_fields = [
            patched_create_cloned_field(sub_field, cloned_types=cloned_types)
            for sub_field in field.sub_fields
        ]
    if field.key_field:
        new_field.key_field = patched_create_cloned_field(
            field.key_field, cloned_types=cloned_types,
        )
    new_field.validators = field.validators
    new_field.pre_validators = field.pre_validators
    new_field.post_validators = field.post_validators
    new_field.parse_json = field.parse_json
    new_field.shape = field.shape
    new_field.populate_validators()
    return new_field


import fastapi  # noqa

fastapi.routing.create_cloned_field = patched_create_cloned_field


from something.main import app  # noqa

Note that the docs at the top only fit our own setup with running everything in docker and note that I did remove the app name (replaced by "something").

Anyway, this seems to work for us now and we do not have any issues. I cannot reproduce the problems we had any more. RAM usage is also still down by a huge amount.

The nice thing about this additional file plus the monkey patch is that we can still just build a normal production version that does not include this.
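The shared cache in the patch above relies on a default argument being evaluated only once, when the function is defined. The following is a minimal standard-library sketch of that mechanism, not the patch itself: `clone`, `A` and `B` are hypothetical names, and cloning is reduced to creating a subclass.

```python
from typing import MutableMapping, Optional
from weakref import WeakKeyDictionary


class A: ...
class B: ...


def clone(
    cls: type,
    cache: Optional[MutableMapping[type, type]] = WeakKeyDictionary(),
) -> type:
    # The default WeakKeyDictionary is created once, when `def` runs, and is
    # shared by every call that omits `cache`.
    if cache is None:
        cache = WeakKeyDictionary()  # explicit None opts out of sharing
    cached = cache.get(cls)
    if cached is None:
        cached = type(cls.__name__, (cls,), {})
        cache[cls] = cached
    return cached


shared_hit = clone(A) is clone(A)                                # same cached clone
isolated_miss = clone(B, cache=None) is not clone(B, cache=None)  # new clone each time
print(shared_hit, isolated_miss)
```

Calls that omit `cache` hit the same dictionary, which is what turns the function's recursion helper into a process-wide cache.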

@melvinkcx

melvinkcx commented Nov 21, 2022

Thank you for the patch, @ddanier.

I can attest that this is an issue with one of the larger services we have.

Before patching:
Screenshot 2022-11-21 at 15 37 23

After patching:
Screenshot 2022-11-21 at 15 37 29

@Lawouach

Just leaving a note that I'll try the patch as well.

When I start the app by itself, it's not much of a problem per se, but in my tests this slowness is repeated for each test case and has made the tests very slow.

@varneyo

varneyo commented Feb 16, 2023

Glad I found this issue thread; my project was getting too slow to debug. Maybe it will get some love at some point, @tiangolo?

@spro

spro commented Feb 22, 2023

Same here, especially annoying with --reload and trying to test small changes

@timabilov

timabilov commented Feb 24, 2023

Although turning off (or, as here, caching) response models speeds up startup a lot, it is still a problem for Lambda cold starts, because internally FastAPI does a lot of dependency management and still uses create_cloned_field somewhere.
So if you have a fat app, it will take 10-30 seconds to initialize the API, and that's already enough to hit a timeout in AWS.

Repository owner locked and limited conversation to collaborators Feb 28, 2023
@tiangolo tiangolo converted this issue into discussion #8609 Feb 28, 2023

