Consider using system trust stores by default in 3.0.0. #2966

Open
Lukasa opened this issue Jan 11, 2016 · 210 comments

Comments

@Lukasa
Member

Lukasa commented Jan 11, 2016

It's been raised repeatedly, mostly by people using Linux systems, that it's annoying that requests doesn't use the system trust store and instead uses the one that certifi ships. This is an understandable position. I have some personal attachment to the certifi approach, but the other side of that argument definitely has a reasonable position too. For this reason, I'd like us to look into whether we should use the system trust store by default, and make certifi's bundle a fallback option.

I have some caveats here:

  1. If we move to the system trust store, we must do so on all platforms: Linux must not be its own special snowflake.
  2. We must have broad-based support for Linux and Windows.
  3. We must be able to fall back to certifi cleanly.

Right now it seems like the best route to achieving this would be to use certitude. This currently has support for dynamically generating the cert bundle OpenSSL needs directly from the system keychain on OS X. If we added Linux and Windows support to that library, we may have the opportunity to switch to using certitude.

Given @kennethreitz's bundling policy, we probably cannot unconditionally switch to certitude, because certitude depends on cryptography (at least on OS X). However, certitude could take the current privileged position that certifi takes, or be a higher priority than certifi, as an optional dependency that is used if present on the system.

Thoughts? This is currently an RFC, so please comment if you have opinions. /cc @sigmavirus24 @alex @kennethreitz @dstufft @glyph @reaperhulk @morganfainberg

@dstufft
Contributor

dstufft commented Jan 11, 2016

I think the system trust stores (or not) essentially boils down to whether you want requests to act the same across platforms, or whether you want it to act in line with the platform it is running on. I do not think that either of these options are wrong (or right), just different trade offs.

I think that it's not as simple on Windows as it is on Linux or OSX (although @tiran might have a better idea). I think that Windows doesn't ship with all of the certificates available and you have to do something (use WinHTTP?) to get it to download any additional certificates on demand. I think that means that if you attempt to dump the certificate store on a brand new Windows install, it will be missing a great many certificates.

On Linux, you still have the problem that there isn't one single set location for the certificate files, the best you can do is try to heuristically guess at where it might be. This gets better on Python 2.7.9+ and Python 3.4+ since you can use ssl.get_default_verify_paths() to get what the default paths are, but you can't rely on that unless you drop 2.6, 2.7.8, and 3.3. In pip we attempt to discover the location of the system trust store (just by looping over some common file locations) and if we can't find it we fall back to certifi, and one problem that has come up is that sometimes we'll find a location, but it's an old outdated copy that isn't being updated by anything. People then get really confused because it works in their browser, it works with requests, but it doesn't in pip.
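The discover-then-fall-back approach described above can be sketched roughly as follows. This is only an illustration and not pip's actual code: the function name `find_system_ca_bundle` and the candidate paths are assumptions, and the caller is expected to fall back to a bundled copy (for example certifi's) when it returns `None`.

```python
import os
import ssl

# Hypothetical candidate locations; real distributions differ and this list
# is not exhaustive.
CANDIDATE_BUNDLES = (
    "/etc/ssl/certs/ca-certificates.crt",  # Debian/Ubuntu style
    "/etc/pki/tls/certs/ca-bundle.crt",    # Fedora/RHEL style
    "/etc/ssl/cert.pem",                   # some BSDs and macOS
)


def find_system_ca_bundle(candidates=CANDIDATE_BUNDLES):
    """Return a path to a system CA bundle file, or None if none is found."""
    # Python 2.7.9+ / 3.4+: ask the ssl module where OpenSSL looks by default.
    paths = ssl.get_default_verify_paths()
    if paths.cafile and os.path.isfile(paths.cafile):
        return paths.cafile
    # Otherwise guess heuristically by looping over common locations.
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


print(find_system_ca_bundle())
```

Finding a file this way says nothing about whether anything keeps it updated, which is the stale-copy failure mode mentioned above.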

I assume the fall back to certifi ensures that things will still work correctly on platforms that either don't ship certificates at all, or don't ship them by default and they aren't installed? If so, that's another possible niggle here that you'd want to think about. Some platforms, like say FreeBSD, don't ship them by default at all. So it's possible that people will have a requests using thing running just fine without the FreeBSD certificates installed, and they then install them (explicitly or implicitly) and suddenly they trust something different and the behavior of the program changes.

Anyways, the desire seems reasonable to me and, if all of the little niggles get worked out, it really just comes down to a question of if requests wants to fall on the side of "fitting in" with a particular platform, or if it wants to prefer cross platform uniformity.

@glyph

glyph commented Jan 11, 2016

I think the system trust stores (or not) essentially boils down to whether you want requests to act the same across platforms, or whether you want it to act in line with the platform it is running on. I do not think that either of these options are wrong (or right), just different trade offs.

I wouldn't say that either option is completely wrong, but I do think that using the platform trust store is significantly right-er, due to the availability of tooling to adjust trust roots on the platform and the relative unavailability of any such tooling for requests or certifi. If you go into Keychain Access to add an anchor (or the Windows equivalent) nothing about requests makes it seem like it would be special, that it would be using a different set of trust roots than you had already configured for everything else.

@dstufft
Contributor

dstufft commented Jan 11, 2016

It depends if your audience are people who are familiar with a particular platform or not. I have no idea how to manage the trust store on Windows but I know how to manage the trust store for requests because requests currently chooses being internally consistent cross platform over being externally consistent with any particular platform.

IOW this change makes it easier for people who are familiar with the system the software is running on at the cost of people who are not.

@glyph

glyph commented Jan 11, 2016

IOW this change makes it easier for people who are familiar with the system the software is running on at the cost of people who are not.

In the abstract, I disagree. Of course, we may have distribution or build toolchain issues which make people have to care about this fact, but if it works properly (pip installs without argument, doesn't require C compiler shenanigans for the end user) then what is the penalty to people who are not familiar with the platform?

@dstufft
Contributor

dstufft commented Jan 11, 2016

It forces them to learn the differences of every platform they are running on.

@tempusfrangit

So I am in favour of using the system store in general because in most cases [outside of dev] if you're relying on requests or a similar library you're going to expect it to work similar to the rest of the system/tooling. Asking someone to learn the tooling for the system they are deploying on is not unreasonable. If an application uses requests and needs to trust a specific cert (end-user story here), it is usual that it'll handle the installation for that platform at install time (aka OS X or Windows).

From a developer perspective, it becomes a little more difficult but still not insurmountable as long as we approach this in a way that the developer has clear methods to continue with the same behaviour as today.

As discussed in IRC, perhaps the easiest method to ensure sanity is to really finish up and polish certitude(? was this the tool discussed?) so we can encapsulate the platform/system-specifics clearly and try and ensure requests logic is the same on all of the platforms.

@glyph

glyph commented Jan 11, 2016

It forces them to learn the differences of every platform they are running on.

How so? By default, it ought to get the trust roots you expect on every platform. It's not like I need to learn new and exciting things when I launch Safari vs. IE just to type https://...

@glyph

glyph commented Jan 11, 2016

I think that it's not as simple on Windows as it is on Linux or OSX (although @tiran might have a better idea). I think that Windows doesn't ship with all of the certificates available and you have to do something (use WinHTTP?) to get it to download any additional certificates on demand. I think that means that if you attempt to dump the certificate store on a brand new Windows install, it will be missing a great many certificates.

You are right about this. I have verified it on my nearly-pristine Windows VM; in a Python prompt, I do:

>>> import wincertstore
>>> print(len(list(wincertstore.CertSystemStore("ROOT"))))

and get "21". Visit some HTTPS websites, up-arrow/enter in the python interpreter, and now I get "23".
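For anyone wanting to repeat this measurement without installing `wincertstore`, a standard-library sketch might look like the following. It assumes that `ssl.enum_certificates` (Windows only, Python 3.4+) reads the same "ROOT" system store; the helper name `count_windows_root_certs` is made up for this example.

```python
import ssl
import sys


def count_windows_root_certs(store_name="ROOT"):
    """Count certificates currently present in a Windows system store.

    Returns None on non-Windows platforms, where ssl.enum_certificates
    is not available.
    """
    if sys.platform != "win32":
        return None
    # Each entry is a (cert_bytes, encoding_type, trust) tuple.
    return len(ssl.enum_certificates(store_name))


print(count_windows_root_certs())
```

Running it before and after visiting a few HTTPS sites should show whether the count grows in the same way.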

@glyph

glyph commented Jan 11, 2016

There's some technical documentation here:

https://technet.microsoft.com/en-us/library/bb457160.aspx

@glyph

glyph commented Jan 11, 2016

And some more explanation here:

http://unmitigatedrisk.com/?p=259

@glyph

glyph commented Jan 11, 2016

Frustratingly, I can't find an API that just tells it to grab the certificate store; it seems that verifying a certificate chain that you don't have the root to is the only way that it adds certificates, and it adds them one at a time as necessary. It baffles me that Microsoft seems to consider storage for certificates a scarce resource.

@glyph

glyph commented Jan 11, 2016

After hours of scouring MSDN, I give up. Hopefully someone else can answer this question: https://stackoverflow.com/questions/34732586/is-there-an-api-to-pre-retrieve-the-list-of-trusted-root-certificates-on-windows

@dstufft
Contributor

dstufft commented Jan 12, 2016

In my experience with pip, which attempts to discover the system store and if it can't find it falls back to a bundled copy, I have had to learn how the system store works on platforms that I have no intention of ever running. This is for a fairly simple method of detection (look for file systems) but it absolutely ends up that way. The simple fact is, in my experience most people have absolutely no idea how their system manages a trust store (and if it manages a trust store or not).

@dstufft
Contributor

dstufft commented Jan 12, 2016

Look for files on the system*

@dstufft
Contributor

dstufft commented Jan 12, 2016

FWIW, If I could prevent downstream redistributors from forcing pip to use the system store, I would revert the change to look in system locations immediately and only ever use a bundled copy. The UX of that tends to be so much nicer, the only reason we started to trust the system trust stores is because redistributors do patch pip to use the system trust store, so you end up in a situation where people get different trust stores based on where their copy of pip came from (which is likely also a concern for requests).

As an additional datapoint, if I remember correctly, the browsers which are not shipped with the OS tend to not use the system trust store either. According to Chrome's Root Certificate Policy they will use the system trust store on OSX and on Windows but they won't use it on Linux. I believe that even where Chrome does use the system store, they still layer their own trust on top of that to allow them to blacklist certificates if need be. I assume this capability is in place because they do not wholly trust the OS trust stores to remove compromised certificates. If I recall correctly, Firefox does not use the system trust store at all on any OS.

Another question is what even is the "root trust store" on Linux. The closest that you can get is wherever the system provided OpenSSL (assuming they even provide OpenSSL) is configured to point to. However, AFAIK there is no way to determine if you're using a system provided OpenSSL or a copy that someone installed (perhaps via Anaconda?), and the additional copy may have stale certificates or no certificates available. If it's stale certificates, then you've successfully lowered the security of requests users by attempting to follow the system trust store. If it's an empty trust store, how do you determine the difference between "empty because I trust nothing" and "empty because my copy of OpenSSL isn't shipping them"? Or do I have to manage the certificates I trust using my OS, unless I want to trust nothing, in which case I have to manage them using requests?

I've also found an article that talks about trying to use the platform certificate trust store on Linux (see here). I've not personally verified the information in this article, however it makes me very wary that using the system trust store would be anything but an exercise in frustration.

@kennethreitz
Contributor

I think our current approach is the correct one, considering who Requests was built for.

That being said, it wouldn't hurt to add more documentation/functionality around using system certs for "advanced" users.

@glyph

glyph commented Jan 13, 2016

I think talking about configurability is maybe a red herring. It's useful – necessary even – in certain circumstances, but users who know they need that can usually figure it out.

The more significant issue is that trust root database updates, especially for end-user client software, are both infrequent and extremely important to do in a timely manner. certifi has no mechanism for automatically updating. Not only will it not come down in a system update, it can't even be done globally; you have to do it once per Python environment (virtualenv, install, home directory, etc, where it's installed).

@tempusfrangit

@glyph I think you may have hit the nail on the head there. That is a significant reason (and addresses the other issues outlined) to use a more centralized location for the cert store.

@dstufft
Contributor

dstufft commented Jan 13, 2016

The flip side is that you have trust stores like Debian which trusted CACert for a long time, and still trusts SPI even though neither of those have gone through any sort of real audit that you'd expect for a trusted CA.

@glyph

glyph commented Jan 13, 2016

The flip side is that you have trust stores like Debian which trusted CACert for a long time, and still trusts SPI even though neither of those have gone through any sort of real audit that you'd expect for a trusted CA.

As I understand it, Microsoft trusts their own CA, too, and Apple trusts theirs. SPI is just Debian's version of that, isn't it?

@dstufft
Contributor

dstufft commented Jan 13, 2016

SPI isn't run by Debian, it's a third party organization similar to that of the Software Freedom Conservancy that Debian happens to be a member of. It'd be more like Microsoft and Apple trusting the CA of their datacenter just because they happened to use them as their data center. In addition, I'm pretty sure that Apple and Microsoft have both passed a WebTrust audit for their root CAs.

@dstufft
Contributor

dstufft commented Jan 13, 2016

To be fair to Debian, I think the current plan is to stop using SPI certificates for their infrastructure and switch to more generally trusted certificates and then stop including SPI and switch to using just the Mozilla bundle without any additions.

That being said, is it even true that they are shipping updates to them? Looking at packages.debian.org for the ca-certificates packages it shows that the versions there are:

  • squeeze (oldoldstable): 20090814+nmu3squeeze1
  • wheezy (oldstable): 20130119+deb7u1
  • jessie (stable): 20141019
  • stretch (testing): 20160104
  • sid (unstable): 20160104

I haven't looked at the actual contents of these packages, but the version numbers lead me to believe that they are not in fact keeping the ca-certificates package up to date.
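To make the reading of those version numbers concrete, here is a small sketch that turns each one into a date and an age. It assumes the leading eight digits of each version string are a YYYYMMDD stamp, which is how the list above is being interpreted; the helper names are made up for the example.

```python
from datetime import date

# Version strings as listed on packages.debian.org in the comment above.
VERSIONS = {
    "squeeze": "20090814+nmu3squeeze1",
    "wheezy": "20130119+deb7u1",
    "jessie": "20141019",
    "stretch": "20160104",
    "sid": "20160104",
}


def version_date(version):
    """Interpret the leading eight digits of a version string as YYYYMMDD."""
    stamp = version[:8]
    return date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))


def age_in_days(version, today):
    """Number of days between the version's date stamp and `today`."""
    return (today - version_date(version)).days


comment_date = date(2016, 1, 13)  # the date this comment was posted
for release, version in VERSIONS.items():
    print(release, version_date(version).isoformat(), age_in_days(version, comment_date))
```

A date stamp in the version is only a hint; a suffix such as `+deb7u1` suggests later uploads on top of the dated base, so the package contents would still need checking.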

In addition to that, looking at the open bugs for ca-certificates there are bugs like #721976, which means that the ca-certificates store includes roots which are not valid for validating servers and are only valid for other purposes (like email). That means you can't actually use the current ca-certificates package without massaging it to remove those certificates yourself.

Another issue, #808600, has Comodo requesting the removal of a particular root that they no longer consider to be in scope for the CAB Forum's Baseline Requirements. That has been removed from testing and sid, but has not been removed from jessie, wheezy, or squeeze. The maintainers of ca-certificates claim they'll be requesting an upload to jessie and wheezy, but not to squeeze (which still has LTS support).

That's just from spending a little bit of time looking at one, fairly popular, distribution. I imagine the long tail of the issues with the system provided bundle gets worse the further away from the popular distributions you get. It's not clear to me that it's a reasonable assumption that the certificates included in any random OS are going to be properly maintained.

The other elephant in the room is we're just assuming that because an update is available to their ca-certificates package that someone is going to have pulled it in. As far as I know, most (if not all?) of the Linux systems do not automatically update by default and require configuration to start to do so. There is likely to be a bigger chance of this with Docker in the picture. On the flip side, I think people generally try to update to the latest versions of their dependencies when working on a project.

@Lukasa
Member Author

Lukasa commented Jan 13, 2016

Debian's cert bundle is almost certainly like Ubuntu's, which does not update to the Mozilla bundle that removed 1024-bit roots, in order to avoid pain like that which hit certifi. All of the distributions with out-of-date cert bundles ship an OpenSSL older than 1.0.2, which means they cannot correctly build the cert chain to a cross-signed root without having the 1024-bit cross-signing root still present. I suspect that's the real concern there.

@dstufft
Contributor

dstufft commented Jan 13, 2016

Maybe they should stop shipping an OpenSSL that can't correctly validate certificate chains.

@alex
Member

alex commented Jan 13, 2016

I reached out to Kurt Roeckx about backporting the fixes for that. He said it was a thing he was looking at doing; I have no clue what the timeline is.

@sigmavirus24
Contributor

So, what if we made the behaviour of looking at the system CA bundle an extra, e.g., pip install requests[system_ca] where it would ship without our certificate bundle and instead use certifi. This allows us to keep on "just work" ing for our default user base while supporting the people who need to use the system bundle.

@andrewleech

Ah sure, that makes sense.

While I appreciate the value of consistency across platforms and deeply understand the complexities here, this issue does continue to make a lot of end-user tools pretty much impossible to use in scenarios where custom/corporate/self-signed certificates are needed. In these cases the code solutions don't help, particularly when the tool's end users aren't Python developers.

I've got a wrapt based code injection module called pip_system_certs which works around the problem by appending to the end of sessions.Session.__init__

    class SslContextHttpAdapter(sessions.HTTPAdapter):
        """Transport adapter that allows us to use system-provided SSL certificates."""
        def init_poolmanager(self, *args, **kwargs):
            import ssl
            ssl_context = ssl.create_default_context()
            ssl_context.load_default_certs()
            kwargs['ssl_context'] = ssl_context
            return super(SslContextHttpAdapter, self).init_poolmanager(*args, **kwargs)

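    # Not defined in this excerpt: `retries`, and `instance` (presumably the Session being initialised).
    secure_adapter = SslContextHttpAdapter(max_retries=retries)
    instance.mount("https://", secure_adapter)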

While I understand that load_default_certs() is very much an imperfect solution, it is "good enough" for a lot of use cases. Would a PR that includes this code pattern be likely to be accepted if it only takes effect when an env variable like REQUESTS_USE_PLATFORM_DEFAULT_CERTS or similar is set? It would allow some/many people to get unblocked at least.

@emixa-d

emixa-d commented Apr 20, 2022

I created https://github.com/tiran/distro-truststore to learn more about GHA and to investigate trust store situation on platforms and various Linux distributions.

Results

* Most platforms have a working `/etc/ssl` directory with either hashed cert directory or a cert bundle.

Guix System does have a /etc/ssl/certs directory and /etc/ssl/certs/ca-certificates.crt too. However, it is recommended that applications look into the environment variable $SSL_CERT_FILE / $SSL_CERT_DIR instead and only use /etc/ssl/certs/ca-certificates.crt and /etc/ssl/certs as fallbacks. That way, the user can override which certificates to trust without having to have root access or play container tricks.
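The lookup order recommended here can be sketched as below. This is an illustration only, not code from Guix or from any library: the function name `resolve_trust_locations` is hypothetical, and the two fallback paths are simply the ones named in the paragraph above.

```python
import os

# Fallback locations named in the comment above.
FALLBACK_FILE = "/etc/ssl/certs/ca-certificates.crt"
FALLBACK_DIR = "/etc/ssl/certs"


def resolve_trust_locations(environ=None):
    """Return (cafile, capath), preferring SSL_CERT_FILE / SSL_CERT_DIR."""
    if environ is None:
        environ = os.environ
    # An unset or empty variable falls through to the system location.
    cafile = environ.get("SSL_CERT_FILE") or FALLBACK_FILE
    capath = environ.get("SSL_CERT_DIR") or FALLBACK_DIR
    return cafile, capath


print(resolve_trust_locations())
```

Because the environment is consulted first, an unprivileged user can redirect trust without touching anything under /etc.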

Also, thank you for looking into this -- having to keep track of what packages bundle their own copies is tedious (read: mostly doesn't happen), only the main nss-certs and le-certs packages are reliably-ish updated.

There are also some other problems with packages depending on python-certifi, rust-webpki-roots and perl-mozilla-ca -- not only do people rarely think of actually updating them, updating them is a bit of a slow process because of their many dependencies that need to be rebuilt (957 for python-certifi and perl-mozilla-ca, information on rust crates is currently unreliable).

Edit: Also, nice to make the standard mechanism for changing the certificates ($SSL_CERT_DIR / variants of /etc/ssl/certs / whatever mechanism other OS have) work, instead of the user having to learn the application/library-specific mechanism every time.

@yrro

yrro commented Apr 21, 2022

However, it is recommended that applications look into the environment variable $SSL_CERT_FILE / $SSL_CERT_DIR instead and only use /etc/ssl/certs/ca-certificates.crt and /etc/ssl/certs as fallbacks. That way, the user can override which certificates to trust without having to have root access or play container tricks.

I believe this is behaviour specific to OpenSSL. We really don't need apps re-implementing this logic, it's only going to further confuse things.

@tiran
Contributor

tiran commented Apr 21, 2022

... and that's how we ended up with PIP_CERT, TWINE_CERT, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, and similar env vars. It's a nightmare to audit, sanitize and configure correctly. Tell me, how are these application specific env vars less confusing than the de-facto standard SSL_CERT_FILE?
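As a rough illustration of the audit burden, the sketch below reports which of these overrides are set in a given environment. The variable names are the ones mentioned in this thread; the helper `find_ca_overrides` is hypothetical, and the list is certainly not complete.

```python
import os

# Variable names taken from the comments in this thread.
CA_OVERRIDE_VARS = (
    "SSL_CERT_FILE",
    "SSL_CERT_DIR",
    "PIP_CERT",
    "TWINE_CERT",
    "REQUESTS_CA_BUNDLE",
    "CURL_CA_BUNDLE",
)


def find_ca_overrides(environ=None):
    """Return the CA-related variables that are set, mapped to their values."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in CA_OVERRIDE_VARS if environ.get(name)}


print(find_ca_overrides())
```

Every additional tool-specific variable is another entry that a list like this has to carry.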

@mhils
Contributor

mhils commented Apr 21, 2022

Since it hasn't been brought up here yet, @sethmlarson and @davisagli have started to write a library to verify certificates using OS trust stores: https://github.com/sethmlarson/truststore

@emixa-d

emixa-d commented Apr 21, 2022 via email

@tiran
Contributor

tiran commented Apr 21, 2022

FWIW Python's ssl module, Golang's crypto/x509 package, and others support SSL_CERT_FILE / DIR as well.

@nanonyme
Contributor

nanonyme commented Apr 21, 2022

Note that just letting the underlying crypto library's cert store do its thing, and not even trying to pass it a cert path, is by far the sanest approach with both OpenSSL and GnuTLS. It allows them to do whatever makes sense on the system (including integrating with p11-kit).
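In Python's standard library, the closest equivalent of "don't pass a cert path" is probably the following. This is a sketch that only covers the OpenSSL-backed `ssl` module (it says nothing about GnuTLS): no `cafile`, `capath` or `cadata` is given, so the library's own defaults decide where trust comes from.

```python
import ssl

# No cafile/capath/cadata is passed, so create_default_context() loads the
# default CA certificates for the platform.
context = ssl.create_default_context()

# Verification and hostname checking are enabled on the resulting context.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)

# Shows where the linked OpenSSL looks, including the env var names it honours.
print(ssl.get_default_verify_paths())
```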

@JonathonReinhart

JonathonReinhart commented Oct 11, 2022 via email

@EmperorArthur

Pip is experimenting with using the truststore package to solve this exact problem.

Unfortunately, per the documentation for that package, using it with requests is not as easy!

@sigmavirus24
Contributor

This project should be considered experimental so shouldn’t be used in production.

From truststore's documentation.

Also how dare a library that existed before SSLContext under a feature and API freeze not just magically work when the two people maintaining it have real lives, full time jobs, and continuously are criticized by the community for not being able to predict the future.

@nanonyme
Contributor

To be fair, that is the problem with all APIs. No one can predict the future, hence all APIs must be designed with deprecations, changes and removals in mind.

@sigmavirus24
Contributor

Yes, in an ideal world that would have happened. The last time requests made a backwards incompatible change, even with deprecation warnings, it was very poorly received and the abuse towards the maintainers was unacceptable.

@sbreit-tomra

No activity in almost a year - is this dead?

@sethmlarson
Member

Truststore already allows you to use system trust stores with Requests, if you need support now: https://truststore.readthedocs.io
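A minimal sketch of opting in, assuming the third-party `truststore` package may or may not be installed: the helper name `make_client_context` is made up for this example, and the fallback to the standard library's default context is an assumption rather than anything truststore or Requests prescribes. The truststore documentation linked above describes how to wire this into Requests itself.

```python
import ssl


def make_client_context():
    """Build a client-side TLS context, preferring the OS trust store.

    Uses truststore.SSLContext when the optional `truststore` package can be
    imported; otherwise falls back to the standard library's defaults.
    """
    try:
        import truststore  # third-party and optional
    except ImportError:
        return ssl.create_default_context()
    return truststore.SSLContext(ssl.PROTOCOL_TLS_CLIENT)


context = make_client_context()
print(type(context))
```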

@achapkowski

@Lukasa @sethmlarson I think there needs to be a better way to pass SSLContext objects into Session objects. I know adapter objects can be created to override the poolmanager, but this feels a bit heavy-handed. I would propose an enhancement to allow Session objects to accept the context object.

I know development is limited on the repo, but it would really help with many ssl/certificate issues.

@ofek
Contributor

ofek commented Apr 4, 2024

Please consider using urllib3 directly or switching to HTTPX, which is what I've done everywhere I can. Either option is fine and is preferable to the current state of things. The only thing that may be lost is access to a smaller selection of plugins.
