Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memory leak with each SSL web request #380

Open
akrherz opened this issue Jan 10, 2024 · 3 comments
Open

memory leak with each SSL web request #380

akrherz opened this issue Jan 10, 2024 · 3 comments

Comments

@akrherz
Copy link

akrherz commented Jan 10, 2024

I'm attempting to diagnose a memory leak with each treq.get call on a HTTPS website leaking small amounts of memory. This is my simple reproducer.

import treq
from twisted.internet.defer import inlineCallbacks
from twisted.internet import reactor


@inlineCallbacks
def main():
    for i in range(1000):
        # author owns this website, so hitting it is fine :)
        _req = yield treq.get('https://mesonet.agron.iastate.edu/robots.txt')
    reactor.stop()


reactor.callLater(0, main)
reactor.run()

running this with memray seems to indicate a SSL verify_paths leak in OpenSSL/SSL.py set_result = _lib.SSL_CTX_set_default_verify_paths(self._context)

    def set_default_verify_paths(self):
        """
        Specify that the platform provided CA certificates are to be used for
        verification purposes. This method has some caveats related to the
        binary wheels that cryptography (pyOpenSSL's primary dependency) ships:

        *   macOS will only load certificates using this method if the user has
            the ``openssl@1.1`` `Homebrew <https://brew.sh>`_ formula installed
            in the default location.
        *   Windows will not work.
        *   manylinux cryptography wheels will work on most common Linux
            distributions in pyOpenSSL 17.1.0 and above.  pyOpenSSL detects the
            manylinux wheel and attempts to load roots via a fallback path.

        :return: None
        """
        # SSL_CTX_set_default_verify_paths will attempt to load certs from
        # both a cafile and capath that are set at compile time. However,
        # it will first check environment variables and, if present, load
        # those paths instead
        set_result = _lib.SSL_CTX_set_default_verify_paths(self._context)

Is there a means to workaround this within treq? Perhaps create some context that can be reused throughout the lifetime of the running app?

@glyph
Copy link
Member

glyph commented Jan 11, 2024

@akrherz thanks for debugging this down to the root!

The problem with recycling the context is that it contains all the information associated with the connection like, for example, the hostname that we are connecting to. So sharing them between hosts, at least, is not practically possible.

However, with some re-plumbing within Twisted, we could store additional connection-specific data such as the hostname in connection.set_app_data rather than just storing the TLSMemoryBIOProtocol in there.

In any case, this issue would probably reproduce just fine with just twisted's Agent, so I think this issue should be reopened there instead. We can leave this open and linked for tracking though, since it definitely affects treq's behavior.

@akrherz
Copy link
Author

akrherz commented Jan 11, 2024

Thanks @glyph for chiming in and all your work on twisted over the years! I wonder if python/cpython#84904 is related? Will boggle this some more. I also have openssl 3.2.0 in play here on conda-forge, so lots of moving parts ;)

@glyph
Copy link
Member

glyph commented Jan 11, 2024

Thanks @glyph for chiming in and all your work on twisted over the years!

You're welcome! Kind of you to say so.

I wonder if python/cpython#84904 is related?

Seems likely that the actual OpenSSL function is the one that's leaking, in which case it would affect both in the same way?

Will boggle this some more. I also have openssl 3.2.0 in play here on conda-forge, so lots of moving parts ;)

OK, let me know what you discover.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants