Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for HTTPS Proxy #714

Closed

Conversation

karpetrosyan
Copy link
Member

@karpetrosyan karpetrosyan commented Jun 9, 2023

Add support for HTTPS proxies

Refs

@karpetrosyan karpetrosyan added the enhancement New feature or request label Jun 9, 2023
@karpetrosyan
Copy link
Member Author

The proxy logic now looks very similar to urllib3 proxies with one exception: they support HTTPS in HTTPS (forwarding) using use_forwarding_for_https, whereas we support all possible combinations except HTTP in HTTPS (forward).

@karpetrosyan
Copy link
Member Author

karpetrosyan commented Jun 9, 2023

Exapmles

proxy_ssl_context

import httpcore

pool = httpcore.HTTPProxy('https://127.0.0.1:8080', proxy_mode="FORWARDING")
resp = pool.request("GET", 'https://www.youtube.com')
# httpcore.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)
import httpcore
import ssl

proxy_context = ssl.create_default_context()
proxy_context.check_hostname = False
proxy_context.verify_mode = False

pool = httpcore.HTTPProxy('https://127.0.0.1:8080', proxy_mode="FORWARDING", proxy_ssl_context=proxy_context)
resp = pool.request("GET", 'https://www.youtube.com')
print(resp.status)
200

@karpetrosyan karpetrosyan requested review from a team and florimondmanca June 9, 2023 11:35
@tomchristie
Copy link
Member

Ah this is very neat, thanks! 😎

Could we keep it sharp, and just add one change-per-pull-request. Adding just proxy_ssl_context would be a more neatly reviewable piece of work, rather than having two changes at once. Seem reasonable?

@karpetrosyan
Copy link
Member Author

I got it, and thanks for reviewing; I'll open another PR for proxy_mode.

@karpetrosyan karpetrosyan mentioned this pull request Jun 10, 2023
@tomchristie
Copy link
Member

Okay, here's where I've go to with testing this...

I referred back to a previous comment encode/httpx#1434 (comment) in order to try this out with proxy.py:

$ venv/bin/pip install trustme-cli proxy.py
$ venv/bin/trustme-cli 
$ venv/bin/proxy --port 6000 --hostname 127.0.0.1 --cert-file server.pem --key-file server.key 

Testing with urllib3

As a point-of-reference let's start out by testing connecting through our SSL proxy with urllib3...

import certifi
import urllib3
import ssl

proxy_ssl_context = ssl.create_default_context()
proxy_ssl_context.load_verify_locations("client.pem")

with urllib3.ProxyManager(
    'https://127.0.0.1:6000/',
    ca_certs=certifi.where(),
    proxy_ssl_context=proxy_ssl_context
) as http:
    r = http.request('GET', 'https://example.com/')
    print(r, r.status)
    # <urllib3.response.HTTPResponse object at 0x10f302080> 200

Testing with httpcore, trio

import httpcore
import certifi
import ssl
import trio

proxy_ssl_context = ssl.create_default_context()
proxy_ssl_context.load_verify_locations("client.pem")

ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations(certifi.where())

async def main():
    async with httpcore.AsyncHTTPProxy(
        'https://127.0.0.1:6000/',
        ssl_context=ssl_context,
        proxy_ssl_context=proxy_ssl_context
    ) as http:
        r = await http.request('GET', 'https://example.com/')
        print(r)
        # <Response [200]>

trio.run(main)

Testing with httpcore, asyncio

import httpcore
import certifi
import ssl
import asyncio

proxy_ssl_context = ssl.create_default_context()
proxy_ssl_context.load_verify_locations("client.pem")

ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations(certifi.where())

async def main():
    async with httpcore.AsyncHTTPProxy(
        'https://127.0.0.1:6000/',
        ssl_context=ssl_context,
        proxy_ssl_context=proxy_ssl_context
    ) as http:
        r = await http.request('GET', 'https://example.com/')
        print(r)
        # <Response [200]>

asyncio.run(main())

Testing with httpcore, sync

import httpcore
import certifi
import ssl


proxy_ssl_context = ssl.create_default_context()
proxy_ssl_context.load_verify_locations("client.pem")

ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations(certifi.where())

with httpcore.HTTPProxy(
    'https://127.0.0.1:6000/',
    ssl_context=ssl_context,
    proxy_ssl_context=proxy_ssl_context
) as http:
    r = http.request('GET', 'https://example.com/')
    print(r)
    # ...
    # httpcore.ConnectError: [SSL: UNEXPECTED_MESSAGE] unexpected message (_ssl.c:997)

The one bit that doesn't currently work is the sync case with an HTTPS proxy connecting to an HTTPS endpoint.

This isn't surprising...

Here's @florimondmanca's comment on this from encode/httpx#1434

this is now supported in urllib3 as of 1.26. It landed via this pull request: urllib3/urllib3#1923. AFAICT they had to implement TLS-in-TLS themselves, overriding the standard http.client connection implementation because that one doesn't support TLS-in-TLS.

And a link to the urllib3 SSLTransport class which supports TLS-in-TLS.

Upshot... in order to make this work our SyncBackend class would need updating in order to support TLS-in-TLS.

@karpetrosyan
Copy link
Member Author

For minimal changes, I believe we should implement SyncTLSStream and return it from SyncStream.start_tls.

@tomchristie
Copy link
Member

For minimal changes, I believe we should implement SyncTLSStream and return it from SyncStream.start_tls.

Agreed, yes.

@karpetrosyan
Copy link
Member Author

Perhaps we should rename SyncTLSStream to TLSinTLSStream because SyncStream can also hold a TLS socket.

@tomchristie
Copy link
Member

tomchristie commented Jun 14, 2023

Ohhh... nice work. I'd suggest that the OverallTimeout is overcomplicated.

Given the trade-off between clarity of the codebase & simpler operation vs some-slight-subtleties-in-timeout-behaviour-for-tls-in-tls, I'd recommend we drop that.

@karpetrosyan
Copy link
Member Author

But do we have to stick with this timeout strategy? I'm talking about decreasing the timeout with each tcp operation.

@tomchristie
Copy link
Member

I see it, but... at least from my perspective, KISS is really important before getting into the fancy stuff.

In terms of review it'd be beneficial to...

  • Start off with a simple-as-possible-to-review unit of work.
  • Figure out how we're going to implement the tests / coverage and get that done.
  • Once those are done we then have a mergeable pull request, and can consider additional smarts as an iterative improvement.

@karpetrosyan
Copy link
Member Author

Isn't this a simple-as-possible-to-review unit of work change? which should be our next step,

@tomchristie
Copy link
Member

Hrm.

Okay different question... in order to get this into a mergeable state we need test coverage.

Since we have the backends documented and part of the public API, testing them at that layer would be the sharpest option.
We'll want...

  • Tests for the backends, without TLS.
  • Tests for the backends, with TLS.
  • Tests for the backends, with TLS-in-TLS.

I suppose the iterative approach would be to start with implementing the just the first of those in a PR independent of this work.

@karpetrosyan
Copy link
Member Author

But are we prepared for tests and coverage? If you want to remove "OverallTimeout" we must first re-implement timeouts and then work on tests/coverage.

@tomchristie
Copy link
Member

I've opened an issue for this at #722 so that we're documenting our work neatly.

@karpetrosyan
Copy link
Member Author

Since the tls-in-tls implementation has moved to #732 , I'll open another PR to work through this more carefully.

@karpetrosyan
Copy link
Member Author

Re-opened in #745

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants