Significant performance degradation because of usage of gather instead of serial await #40

Closed
kevinvalk opened this issue Apr 6, 2023 · 2 comments

@kevinvalk

Scenario: Python 3.11 GraphQL gateway using Ariadne with lots of nested data

During development I found a significant performance degradation. I raised this in GraphQL core (graphql-python/graphql-core#190). After some more research I found that using gather on CPU-bound tasks causes significant overhead (graphql-python/graphql-core#190 (comment)). For CPU-bound async tasks it is better to use a sequential await.
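
To illustrate the overhead (this sketch is mine, not from the issue): gather wraps every awaitable in a Task and round-trips through the event loop, while a serial await of a coroutine that completes immediately never yields at all. Coroutines that complete immediately are exactly what cache hits look like:

import asyncio
import time

async def cached_value(i):
    # Returns without ever yielding to the event loop, like a cache hit.
    return i

async def main():
    n = 100_000

    start = time.perf_counter()
    await asyncio.gather(*[cached_value(i) for i in range(n)])
    print("gather:", time.perf_counter() - start)

    start = time.perf_counter()
    for coro in [cached_value(i) for i in range(n)]:
        await coro
    print("serial:", time.perf_counter() - start)

asyncio.run(main())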

So I monkey-patched gather into a serial await in GraphQL core, but I still had very slow responses. Today I finally dove into this problem again and saw that there was another gather in aiodataloader!

[screenshot: aiodataloader with gather]

As far as I understand, the goal of the dataloader (when used with a cache) is to perform only a few IO-bound lookups and serve all other loads directly from the cache. That means gather is mostly being applied to CPU-bound work. I monkey-patched the aiodataloader gather into a serial await and my requests went from 3s to 500ms.

[screenshot: aiodataloader with serial await]

I am not sure this holds in every case (for example when not using the cache), but as long as you want caching you really need a serial await. Maybe I am missing something (please let me know), but I would suggest adding a serial await to load_many when the cache is used. Here is the monkey patch I used:


from asyncio import Future
from importlib import import_module
from typing import Any, Awaitable, Iterable, List, TypeVar

ReturnT = TypeVar("ReturnT")


async def serial_gather(*futures: Awaitable[Any]) -> List[Any]:
    # Await the futures one after another instead of scheduling them
    # concurrently with asyncio.gather.
    return [await future for future in futures]


aiodataloader = import_module("aiodataloader")


def load_many(self, keys: Iterable[Any]) -> "Future[List[ReturnT]]":
    """
    Loads multiple keys, returning a list of values

    >>> a, b = await my_loader.load_many(['a', 'b'])

    This is equivalent to the more verbose:

    >>> a, b = await gather(
    ...     my_loader.load('a'),
    ...     my_loader.load('b'),
    ... )
    """
    if not isinstance(keys, Iterable):
        raise TypeError(
            "The loader.load_many() function must be called with "
            "Iterable<key> but got: {}.".format(keys)
        )

    # Serial await instead of the upstream asyncio.gather.
    return serial_gather(*[self.load(key) for key in keys])


# Monkey-patch the upstream class so every DataLoader picks this up.
aiodataloader.DataLoader.load_many = load_many
@markedwards
Collaborator

Doing this would lead to very poor performance when retrieving keys that are not in the cache. It essentially defeats batching, which is the entire point of the DataLoader pattern.

What’s the use case that leads to calling load_many() on hundreds of thousands of keys where the results are already cached? Are you using aiodataloader as an application-level cache?
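
A minimal sketch (mine, not from the thread) of why the serial await defeats batching; the EchoLoader subclass and keys are illustrative. With gather, both loads are enqueued before the event loop runs the dispatch, so batch_load_fn sees one batch; with a serial await, each key is resolved in its own batch:

import asyncio
from aiodataloader import DataLoader

class EchoLoader(DataLoader):
    async def batch_load_fn(self, keys):
        print("batch:", keys)
        return keys

async def main():
    loader = EchoLoader()
    # Concurrent loads coalesce: prints  batch: ['a', 'b']
    await asyncio.gather(loader.load("a"), loader.load("b"))

    loader = EchoLoader()
    # Serial awaits dispatch per key: prints  batch: ['a']  then  batch: ['b']
    await loader.load("a")
    await loader.load("b")

asyncio.run(main())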

@kevinvalk
Author

Found the "problem" (which ended up being a user error). If you are running Python in debug mode the asyncio loop is also set to debug. This ensures that on each context switch the full stack trace is kept. This is quite expensive so when using Gather (depending on your workload) the tasks may end up switching A LOT which completely kills performance. In production this is not a problem because the asyncio loop is not set to debug.

TL;DR: if you want the actual performance, disable debug on your asyncio loop:

import asyncio

loop = asyncio.get_running_loop()  # must be called inside a coroutine
loop.set_debug(False)
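
For context (an addition, not from the original comment), asyncio debug mode can also be switched on implicitly, which is easy to miss:

# Any of these enable asyncio debug mode:
#   PYTHONASYNCIODEBUG=1 python app.py
#   python -X dev app.py
#   asyncio.run(main(), debug=True)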
