New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid reloading root certificates to improve concurrent performance #6667
base: main
Are you sure you want to change the base?
Conversation
…hen `verify=True`
We've been hit by the slowness of
Is there anything I can do to advance this PR? |
Given this breaks the behavior of the module for a whole class of users as it's written today, there's not much to do to advance it. Even still, I'm pretty sure SSLComtext is not itself thread safe but I need to find a reference for that so loading it at the module will likely cause issues |
Here's your quote: python/cpython#95031 (comment)
Can't speak for the reliability of that quote though. Given that a lot of time is spent in |
Here is a more reliable reference from the openssl guys themselves about EDIT: Here are the official docs for openssl. Unfortunately the python SSLContext docs don't mention thread safety at all. Found this in the Python 3.13.0a5 release docs:
Asked for official clarification on the thread-safety of SSLContext here. |
If you don't trust me, then please trust David Benjamin's statement on threading. He is the main lead behind BoringSSL and did a lot of TLS stuff in Chrome browser. |
@tiran Thank you very much for your detailed commentary!
@sigmavirus24 What exactly are you unhappy about with the current solution and what would an acceptable solution for you look like? Would you prefer something like introducing a new "ssl_context" parameter somehow that could be set by the library user and be set to something like |
See #2118 I recommend that you either use urllib3 directly or switch to httpx. Most of the secret sauce of requests is in urllib3. httpx has HTTP/2 support. |
Requests supports being run (with certifi) inside a zip file created by a tool like pants, pyinstaller, etc. The code here removes that support in loading the trust stores. |
Ah, I see that the PR was updated and moved the extraction. I think the last blocker is the context being "public" in how it is named. That will encourage people to modify it in a way we don't want to be supporting. People will attempt to modify it anyway but when things go wrong, it will be clear that they weren't intended to modify it. To be clear, I expect people to use this to work around libraries that aren't written correctly that leverage requests but do not allow people to specify a trust store or customize a session. Either way, if we rename the default I'm in favor of merging this. Unless @nateprewitt has objections. (And yes, the caveats around changing ciphers or loading other trust stores is what I recalled being a problem that has odd behavior but even so, making this private will alleviate that likely spike in issues.) |
Thanks for the follow-up @sigmavirus24. I renamed the default context as requested, please let me know if you'd like any further changes. |
Reproducing the problem
Let's consider the following script. It runs a bunch of concurrent requests against a URL, both with certificate verification enabled and disabled, and outputs the time it takes to do it in both cases.
What's the time difference between the two? It turns out it is highly dependent on your local configuration. On my local machine, with a relatively modern config (Python 3.12 + OpenSSL 3.0.2), the times are
~1.2s
forverify=True
and~0.5s
forverify=False
.It's a >100% difference, but we initially blamed it on cert verification taking some time. However, we observed even larger differences (>500%) in some environments, and decided to find out what was going on.
Problem description
Our main use case for requests is running lots of requests concurrently, and we spent some time bisecting this oddity to see if there was room for a performance optimization.
The issue is a bit more clear if you profile the concurrent executions. When verifying certs, these are the top 3 function calls by time spent in them:
Conversely, this is how the top 3 looks like without cert verification:
In the first case, a full 0.68 seconds are spent in the
load_verify_locations()
function of thessl
module, which configures aSSLContext
object to use a set of CA certificates for validation. Inside it, there is a C FFI call to OpenSSL'sSSL_CTX_load_verify_locations()
which is known to be quite slow. This happens once per request (hence the30
on the left).We believe that, in some cases, there is even some blocking going on, either because each FFI call locks up the GIL or because of some thread safety mechanisms in OpenSSL itself. We also think that this is more or less pronounced depending on internal changes between OpenSSL's versions, hence the variability between environments.
When cert validation isn't needed, these calls are skipped which speeds up concurrent performance dramatically.
Submitted solution
It isn't possible to skip loading root CA certificates entirely, but it isn't necessary to do it on every request. More specifically, a call to
load_verify_locations()
happens when:A new
urllib3.connectionpool.HTTPSConnectionPool
is created.On connection, by urllib3's
ssl_wrap_socket()
, when the connection'sca_certs
orca_cert_dir
attributes are set (see the relevant code).The first case doesn't need to be addressed anymore after the latest addition of
_get_connection()
. Since it now passes downpool_kwargs
, this allows urllib3 to use a cached pool with the same settings every time, instead of creating one per request.The second one is addressed in this PR. If a verified connection is requested,
_urllib3_request_context()
already makes it so that a connection pool using aSSLContext
with all relevant certificates loaded is always used. Hence, there is no need to trigger a call toload_verify_locations()
again.You can test against https://invalid.badssl.com to check that
verify=True
andverify=False
still behave as expected and are now equally fast.I'd like to mention that there have been a few changes in Requests since I started drafting this, and I'm not sure that setting
conn.ca_certs
orconn.ca_certs = cert_loc
incert_verify()
is even still needed, since I think that the logic could be moved to_urllib3_request_context()
and benefit from using a cached context in those cases too.