
HTTPX AsyncClient slower than aiohttp? #838

Closed
imbolc opened this issue Mar 1, 2020 · 22 comments
Labels: discussion, perf (Issues relating to performance)

Comments

@imbolc

imbolc commented Mar 1, 2020

I just found the async client to be significantly slower compared to aiohttp, especially for a single request:

                 aiohttp   httpx   aiohttp/httpx
single, rps      1062      141     7.5
session, rps     1862      1197    1.5
session/single   1.8       8.5

I tried both the release and master; the results are pretty much the same. The code of the benchmark is here: https://gist.github.com/imbolc/15cab07811c32e7d50cc12f380f7f62f
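The gist isn't reproduced here, but a rough sketch of the shape of such a benchmark (the URL and request count below are placeholders, not the gist's values) looks like this:

import asyncio
import time

import aiohttp
import httpx

URL = "http://localhost:8000/"  # placeholder: any local test server
N = 100                         # placeholder request count

async def httpx_single() -> None:
    # New client per request: pays client setup and connection cost every time.
    for _ in range(N):
        async with httpx.AsyncClient() as client:
            await client.get(URL)

async def httpx_session() -> None:
    # One client reused for all requests.
    async with httpx.AsyncClient() as client:
        for _ in range(N):
            await client.get(URL)

async def aiohttp_single() -> None:
    for _ in range(N):
        async with aiohttp.ClientSession() as session:
            async with session.get(URL):
                pass

async def aiohttp_session() -> None:
    async with aiohttp.ClientSession() as session:
        for _ in range(N):
            async with session.get(URL):
                pass

async def main() -> None:
    cases = [
        ("aiohttp single", aiohttp_single),
        ("aiohttp session", aiohttp_session),
        ("httpx single", httpx_single),
        ("httpx session", httpx_session),
    ]
    for name, fn in cases:
        start = time.monotonic()
        await fn()
        print(f"{name}: {N / (time.monotonic() - start):.0f} req/s")

asyncio.run(main())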

@florimondmanca florimondmanca added the discussion and perf (Issues relating to performance) labels Mar 1, 2020
@florimondmanca florimondmanca changed the title from "Performance issue" to "HTTPX AsyncClient slower than aiohttp" Mar 1, 2020
@florimondmanca
Member

florimondmanca commented Mar 1, 2020

Thanks, these are very interesting quantified insights!

I went ahead and added the aiohttp/httpx ratio to your table.

I used your script to run the benchmark on my own machine, and observed similar results:

                 aiohttp   httpx   aiohttp/httpx
single, rps      220       48      4.6
client, rps      495       331     1.5
client/single    2.3       6.9

@florimondmanca
Member

The client/single ratio for HTTPX is not surprising to me — we know that using a client significantly increases performance.

The aiohttp/httpx ratio in the client case isn't surprising either — I had already noted that we were slower than aiohttp in the past (don't think I posted the results on GitHub though).

What's more surprising to me is, as you said, the 3x higher aiohttp/httpx ratio in the single case. I interpret it as "setting up a client (or opening a single connection, or whatever) is not as efficient as it could be".

I'll run httpxprof over your scripts and see what other insights I can come up with. :-)

@florimondmanca
Member

florimondmanca commented Mar 1, 2020

Okay, so apparently one massively useless thing we do is eager TLS setup on Client instantiation, even though none of the eventually requested URLs uses HTTPS.

If we run:

pip install -e git+https://github.com/florimondmanca/httpxprof.git#egg=httpxprof
httpxprof run async_single
httpxprof view async_single

We get this view:

[Screenshot: httpxprof flame graph for the HTTPX single-request case]

=> 92% (!) of the time spent instantiating the client (which itself is 62% of the total time making a single request with a client) is spent setting up SSL (.load_ssl_context()).

Compare this to aiohttp -- client instantiation is not even visible on this graph, and the bulk of the time is only spent, well, making the actual request to the non-TLS URL:

[Screenshot: httpxprof flame graph for the aiohttp single-request case]

I used the following to get the profile above:

# aiohttp_single.py
import asyncio
import aiohttp
import httpxprof

async def main() -> None:
    for _ in httpxprof.requests():
        async with aiohttp.ClientSession() as session:
            async with session.get(httpxprof.url):
                pass

asyncio.run(main())

httpxprof run aiohttp_single.py
httpxprof view aiohttp_single.py

So one action point already might be to lazily load the SSL configuration on the first request that uses HTTPS.
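A sketch of what "lazy" could mean here (illustrative only, not HTTPX internals): defer building the verified context until something actually asks for it, e.g. via a cached property.

import ssl
from functools import cached_property

class LazySSLConfig:
    # Illustrative sketch: the expensive verified context (and its CA
    # bundle load) is only built the first time it is requested.
    @cached_property
    def ssl_context(self) -> ssl.SSLContext:
        return ssl.create_default_context()

config = LazySSLConfig()       # cheap: nothing loaded yet
context = config.ssl_context   # the first HTTPS use pays the cost, once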

@victoraugustolls

victoraugustolls commented Mar 1, 2020

I might be missing something, but if loading the SSL context is needed in both aiohttp and httpx, is the time spent doing so somewhat similar? And the total time? I can also try to take these numbers later.

I see that in the gist the host is http and not https; that's why I'm questioning this.

@florimondmanca
Member

what if loading the SSL context is needed in both aiohttp and httpx

Yes, I actually just realized that optimizing for requests against an insecure host is probably marginally useful for real-world situations anyway. We should be comparing aiohttp and HTTPX requesting a host over HTTPS.

I'm going to set up a local server with HTTPS turned on (see instructions here) and run the profiling again, this time requesting https://localhost:8000. 👍

@victoraugustolls

Don't know how much this would affect the final result, but aiohttp uses cchardet and aiodns for optimization.

@imbolc
Author

imbolc commented Mar 1, 2020

optimizing for requests against an insecure host is probably marginally useful for real-world situations anyway

Yep, an important one is a bunch of microservices spinning on localhost

@florimondmanca
Member

@imbolc I forked your gist and created a version that runs wrk on a server running on HTTPS, in which both aiohttp and HTTPX request the server using the same CA bundle: https://gist.github.com/florimondmanca/fbc85b58e9ce61e74b73df1e42829838

Running it on my machine, I get the updated results below:

                  aiohttp   httpx   aiohttp/httpx
single (req/s)    121       48      2.5
client (req/s)    284       221     1.2
client/single     2.3       4.6     0.5

The difference for the single-request case went from 8x to 2-3x, which is more reasonable, and not entirely surprising to me (we haven't been very focused on optimization so far).


Besides, running httpxprof again on an HTTPS server, in the single-request case:

  • Client instantiation now only represents 33% of the time spent making a single request (instead of 60+%).
  • TLS setup is now only about 50% of client instantiation time -- so about 15% of the total.

(The improvement over the previous setup might come from the fact that I now explicitly pass verify="client.pem", whereas previously HTTPX had to look up certs via certifi.)

The aiohttp equivalent setup (using SSL Control for TCP Sockets) spends about 16% of the time in ssl.create_default_context().

So actually, there's no real burden on our side due to TLS/SSL.

Still, aiohttp seems to set up TLS more efficiently than we do. We get our certs from certifi (I've already seen some people argue this may not actually be the best choice?), but I'm not sure how aiohttp handles default certs… Anyone got a clue? I haven't seen anything in their docs.

@florimondmanca
Member

but I'm not sure how aiohttp handles default certs…

Ah, so from ClientSession.request() I see that they use ssl.create_default_context():

ssl: SSL validation mode. None for default SSL check (ssl.create_default_context() is used) [...]

I tried using httpx.get("https://google.com", verify=ssl.create_default_context()), and it works like a charm. So do we need certifi? Could this be related to #302?
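For reference, the same idea in a couple of lines (plain usage of the verify parameter, nothing beyond what's described above):

import ssl
import httpx

# Build the stdlib default context once and hand it to HTTPX.
context = ssl.create_default_context()

print(httpx.get("https://google.com", verify=context).status_code)

# The same context can also be reused across clients for the process lifetime.
with httpx.Client(verify=context) as client:
    print(client.get("https://google.com").status_code)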

@florimondmanca florimondmanca changed the title from "HTTPX AsyncClient slower than aiohttp" to "HTTPX AsyncClient slower than aiohttp?" Mar 1, 2020
@victoraugustolls

Don't know if it's 100% performance related, but on #832 I added a repository with sample code where timeouts happen with httpx, but not with aiohttp, at the same request volume.

@florimondmanca
Member

Hi, going to close this off as "yes, we're maybe 2x-3x slower than aiohttp, but getting up to speed there isn't a priority for 1.0", though PRs on anything that might be a performance burden are still very much welcome! Thanks all.

@tomchristie
Member

I'd be pretty skeptical that we're comparing like-for-like (eg. there's various SSL, .netrc behaviour etc. stuff that may differ and mean that we have a heavier client instantiation than aiohttp), or that single requests to a local server are a meaningful metric.

If we do look at any benchmarking at some point, I'd want to look at measurements after a client instance is created, since you really want to be instantiating a single client instance which is then used for the lifetime of the app.

@imbolc
Author

imbolc commented Mar 16, 2020

I'd want to look at measurements after a client instance is created

It's included in the benchmark: the second row in the table.

@JustJia

JustJia commented Dec 16, 2020

Okay, so apparently one massively useless thing we do is eager TLS setup on Client instantiation, even though none of the eventually requested URLs uses HTTPS.

If we run:

pip install -e git+https://github.com/florimondmanca/httpxprof.git#egg=httpxprof
httpxprof run async_single
httpxprof view async_single

We get this view:

[Screenshot: httpxprof flame graph for the HTTPX single-request case]

=> 92% (!) of the time spent instantiating the client (which itself is 62% of the total time making a single request with a client) is spent setting up SSL (.load_ssl_context()).

Compare this to aiohttp -- client instantiation is not even visible on this graph, and the bulk of the time is only spent, well, making the actual request to the non-TLS URL:

[Screenshot: httpxprof flame graph for the aiohttp single-request case]

I used the following to get the profile above:

# aiohttp_single.py
import asyncio
import aiohttp
import httpxprof

async def main() -> None:
    for _ in httpxprof.requests():
        async with aiohttp.ClientSession() as session:
            async with session.get(httpxprof.url):
                pass

asyncio.run(main())

httpxprof run aiohttp_single.py
httpxprof view aiohttp_single.py

So one action point already might be to lazily load the SSL configuration on the first request that uses HTTPS.


How did you get that flame graph of the code? It looks useful.

@florimondmanca
Member

@JustJia It's from httpxprof, which is a small wrapper around SnakeViz; the code is here :) https://github.com/florimondmanca/httpxprof

@omerXfaruq

omerXfaruq commented Jan 28, 2022

I'd be pretty skeptical that we're comparing like-for-like (eg. there's various SSL, .netrc behaviour etc. stuff that may differ and mean that we have a heavier client instantiation than aiohttp), or that single requests to a local server are a meaningful metric.

If we do look at any benchmarking at some point, I'd want to look at measurements after a client instance is created, since you really want to be instantiating a single client instance which is then used for the lifetime of the app.

Hello @tomchristie,

I'd like to ask about your current thoughts on the speed comparison between aiohttp and httpx, since some time has passed since this comment. I believe these two sources contain a fair comparison between the two libraries. It seems aiohttp is relatively faster than httpx; I wonder what could be the reasons for that?

  1. https://blog.jonlu.ca/posts/async-python-http
  2. https://developpaper.com/requests-aiohttp-httpx-comparison/

Let me reference Tom's comment about the advantages of httpx.

@tomchristie
Member

I wonder what could be the reasons for that

I've not dug into the places you've linked, but a couple of things to note...

  • aiohttp performs DNS caching.
  • httpx/httpcore and requests/urllib3 do not

For certain types of requests that'll make a difference.
At some point I'd expect we'll look into supporting that too.

Also we fairly recently resolved an issue with upload speeds (#1948)

Tho I'll happily spend time looking into this if what you've posted doesn't match up with either of those two possible cases.
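For context, aiohttp's DNS cache lives on its connector (ttl_dns_cache is in seconds), while HTTPX currently has no equivalent knob. A minimal sketch of the aiohttp side, for illustration only:

import asyncio
import aiohttp

async def main() -> None:
    # aiohttp caches resolved hosts on the connector; repeated requests to
    # the same host within the TTL skip DNS resolution entirely.
    connector = aiohttp.TCPConnector(ttl_dns_cache=300)
    async with aiohttp.ClientSession(connector=connector) as session:
        for _ in range(10):
            async with session.get("https://example.org") as response:
                print(response.status)

asyncio.run(main())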

@omerXfaruq

I've not dug into the places you've linked, but a couple of things to note...

* `aiohttp` performs DNS caching.

* `httpx`/`httpcore` and `requests`/`urllib3` do not

@tomchristie that certainly could be the case, since all the requests are sent to the same IP in both of the benchmarks. Supporting DNS caching would be great for httpx/httpcore as well.
By the way, I feel like the speed difference is still huge, so I would still suggest you look into the benchmarks; a few screenshots are below.

[Screenshots: benchmark results from the two linked posts]

@mrkovalchuk

mrkovalchuk commented Sep 29, 2022

As far as I can see, we try to load CA certificates for every connection (in the single-request case).

class SSLConfig:
    ...
    DEFAULT_CA_BUNDLE_PATH = Path(certifi.where())
    ...
    def load_ssl_context_verify(self) -> ssl.SSLContext:
        ...
        elif isinstance(self.verify, bool):
            ca_bundle_path = self.DEFAULT_CA_BUNDLE_PATH
        ...
        if ca_bundle_path.is_file():
            logger.trace(f"load_verify_locations cafile={ca_bundle_path!s}")
            context.load_verify_locations(cafile=str(ca_bundle_path))
        elif ca_bundle_path.is_dir():
            logger.trace(f"load_verify_locations capath={ca_bundle_path!s}")
            context.load_verify_locations(capath=str(ca_bundle_path))
        ...

aiohttp just uses the default context built by ssl.create_default_context without providing any file/path to certificates.

As @florimondmanca says, using a custom context without the extra verification setup closes the huge gap with aiohttp.

Should we do it the same way? Or what is the reason to do it our way?
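Boiled down, the difference being discussed is roughly the following (a sketch for illustration, not the exact code of either library):

import ssl
import certifi

# Roughly the HTTPX default: explicitly load the certifi CA bundle from
# disk, which is the per-client work that shows up in the profiles above.
httpx_style = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
httpx_style.load_verify_locations(cafile=certifi.where())

# Roughly the aiohttp default: the stdlib default context, with no
# explicit CA bundle file passed in.
aiohttp_style = ssl.create_default_context()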

@chdsbd

chdsbd commented Oct 5, 2022

Just want to chime in here and say that I've been noticing load_ssl_context_verify to be a huge performance problem in my app. We make a lot of network calls, and most of our time is spent in load_ssl_context_verify.

@sbdchd

sbdchd commented Oct 25, 2022

I did some benchmarking on this, and caching the SSL context makes a huge difference in perf.

This is an ugly hack for testing, but it shows the general idea for the caching:

diff --git a/httpx/_config.py b/httpx/_config.py
index d164e4c..262763b 100644
--- a/httpx/_config.py
+++ b/httpx/_config.py
@@ -50,6 +50,7 @@ def create_ssl_context(
         cert=cert, verify=verify, trust_env=trust_env, http2=http2
     ).ssl_context
 
+context = None
 
 class SSLConfig:
     """
@@ -99,11 +100,17 @@ class SSLConfig:
         """
         Return an SSL context for verified connections.
         """
+        global context
         if self.trust_env and self.verify is True:
             ca_bundle = get_ca_bundle_from_env()
             if ca_bundle is not None:
                 self.verify = ca_bundle
 
+        if context is None:
+            context = ssl.create_default_context()
+        self._load_client_certs(context)
+        return context
+
         if isinstance(self.verify, ssl.SSLContext):
             # Allow passing in our own SSLContext object that's pre-configured.
             context = self.verify

Some flame graphs generated with py-spy:

sudo ./.venv/bin/py-spy record --pid 10056 --output foo.svg

[Flame graphs: aiohttp, httpx-nocached-ssl-context, httpx-cached-ssl-context]

import asyncio
import httpx
import aiohttp

URL = "https://example.com"

async def main() -> None:
    for _ in range(0, 10_000):
        async with httpx.AsyncClient() as client:
            r = await client.get(URL)
            print(r.status_code)


async def main2() -> None:
    for _ in range(0, 10_000):
        async with aiohttp.ClientSession() as session, session.get(URL) as r:
            print(r.status)

if __name__ == '__main__':
    asyncio.run(main2())
    # asyncio.run(main())

@sbdchd

sbdchd commented Oct 25, 2022

The workaround mentioned in #838 (comment) can also be used on client instantiation:

import asyncio
import httpx
import ssl

URL = "https://example.com"

# "cache" at module level
context = ssl.create_default_context()

async def main() -> None:
    while True:
        async with httpx.AsyncClient(verify=context) as client:
            r = await client.get(URL)
            print(r.status_code)

if __name__ == '__main__':
    asyncio.run(main())

Running the following code w/o and w/ the verify param:

import asyncio
import httpx
import ssl

# "cache" at module level
context = ssl.create_default_context()

async def main() -> None:
    while True:
        async with httpx.AsyncClient(verify=context) as client:
            print("foo")

if __name__ == '__main__':
    asyncio.run(main())

Before: [flame graph: fixed-day-4]

After: [flame graph: fixed-day-5]

kodiakhq bot pushed a commit to chdsbd/kodiak that referenced this issue Oct 25, 2022
Work around for a perf issue with how httpx handles SSL.

Instead of using `httpx.AsyncClient` directly, we subclass and reuse the ssl context.

rel: encode/httpx#838
Mark90 added a commit to workfloworchestrator/oauth2-lib that referenced this issue Sep 7, 2023
pboers1988 pushed a commit to workfloworchestrator/oauth2-lib that referenced this issue Sep 11, 2023
* Create HTTPX_SSL_CONTEXT once and use it in AsyncClient instances

Background: encode/httpx#838

* Refactor OIDCUser.__call__ to exit early and create a scoped AsyncClient with an existing ssl context

* Simplify refresh_client_creds_token

* Bump version to 1.3.4
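
Both commits above apply the same pattern; a minimal sketch of it (names here are illustrative, not taken from either repository):

import ssl
import httpx

# Built once per process and shared by every client instance.
SSL_CONTEXT = ssl.create_default_context()

class SharedContextAsyncClient(httpx.AsyncClient):
    """AsyncClient that reuses a single pre-built SSL context."""

    def __init__(self, **kwargs) -> None:
        kwargs.setdefault("verify", SSL_CONTEXT)
        super().__init__(**kwargs)

Instantiating SharedContextAsyncClient() then skips the per-client certificate load discussed throughout this issue.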