Respect system proxy exclusions. #1536

Open
1 task done
tomchristie opened this issue Mar 25, 2021 · 10 comments
Labels
requests-compat Issues related to Requests backwards compatibility wontfix
Milestone

Comments

@tomchristie
Member

We're currently leaning on urllib.request.getproxies() to determine the system proxy setup, and to decide which mounts should be a proxy transport and which should be a regular transport.

However, we're not using urllib.request.proxy_bypass(host).

This all works as expected when environment settings are being used, i.e. HTTP_PROXY, HTTPS_PROXY, and ALL_PROXY. In that case we're reading NO_PROXY, and ensuring any hostname patterns there are mounted as a regular transport...

httpx/httpx/_utils.py

Lines 304 to 320 in 68cf1ff

    no_proxy_hosts = [host.strip() for host in proxy_info.get("no", "").split(",")]
    for hostname in no_proxy_hosts:
        # See https://curl.haxx.se/libcurl/c/CURLOPT_NOPROXY.html for details
        # on how names in `NO_PROXY` are handled.
        if hostname == "*":
            # If NO_PROXY=* is used or if "*" occurs as any one of the comma
            # separated hostnames, then we should just bypass any information
            # from HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, and always ignore
            # proxies.
            return {}
        elif hostname:
            # NO_PROXY=.google.com is marked as "all://*.google.com",
            #   which disables "www.google.com" but not "google.com"
            # NO_PROXY=google.com is marked as "all://*google.com",
            #   which disables "www.google.com" and "google.com".
            #   (But not "wwwgoogle.com")
            mounts[f"all://*{hostname}"] = None
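
For anyone who wants to poke at the resulting mount keys, here is a minimal self-contained sketch of the same idea. It is a simplification and not the actual httpx implementation: the helper name `mounts_from_proxy_info` and the proxy URL are made up for illustration, and the input is assumed to be a dict in the shape that getproxies() returns.

```python
from typing import Dict, Optional


def mounts_from_proxy_info(proxy_info: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Turn a getproxies()-style dict into mount patterns (hypothetical helper)."""
    mounts: Dict[str, Optional[str]] = {}

    # Proxied schemes: the value is the proxy URL to route through.
    for scheme in ("http", "https", "all"):
        if proxy_info.get(scheme):
            mounts[f"{scheme}://"] = proxy_info[scheme]

    # Exclusions: a value of None stands for "use a regular transport".
    for hostname in (host.strip() for host in proxy_info.get("no", "").split(",")):
        if hostname == "*":
            return {}  # NO_PROXY=* disables proxying entirely
        if hostname:
            mounts[f"all://*{hostname}"] = None
    return mounts


# Hypothetical settings, as if HTTP_PROXY and NO_PROXY were both set.
mounts = mounts_from_proxy_info(
    {"http": "http://proxy.internal:3128", "no": ".google.com, localhost"}
)
```

The point of the sketch is that every exclusion becomes a static pattern key, computed once up front, rather than a check made per request.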

However, when none of those environment variables are set, getproxies() instead falls back to the system proxy configuration. On Windows this is registry based (ProxyEnable and ProxyOverride). On Mac this is sysconf based.
In those cases we're correctly getting the configured proxies, but we aren't dealing with proxy exclusions.

We'd like to be able to set up these exclusions with our neat hostname-pattern-matched mounts system, which means we can't just fall back to urllib.request.proxy_bypass(host), because that needs to be called per host.
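
To illustrate the "per host" problem, here is a small sketch using urllib.request.proxy_bypass_environment, which handles the environment-variable case and accepts an explicit proxies dict, so the result does not depend on the machine it runs on. The proxy URL and hostnames are made up. The system-level proxy_bypass(host) has the same one-host-at-a-time shape, but its answer depends on the platform's configuration.

```python
from urllib.request import proxy_bypass_environment

# Hypothetical proxy settings, in the dict shape that getproxies() returns.
proxies = {"http": "http://proxy.internal:3128", "no": ".example.com,localhost"}

# The bypass decision can only be asked about one concrete host at a time,
# so there is no way to enumerate the exclusions up front as mount patterns.
results = {
    host: bool(proxy_bypass_environment(host, proxies))
    for host in ("www.example.com", "localhost", "example.org")
}
```

Here `results` maps each hostname to whether it would skip the proxy, which is exactly the information we'd want as patterns instead.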

So, first steps...

  • What exactly is the format of the Windows registry ProxyOverride field?
  • What exactly is the format of the "exceptions" field returned by _scproxy._get_proxy_settings()?
@tomchristie tomchristie added the requests-compat Issues related to Requests backwards compatibility label Mar 25, 2021
@tomchristie
Member Author

tomchristie commented Mar 25, 2021

  1. The Windows registry ProxyOverride field.

A useful starting point is here...

https://github.com/python/cpython/blob/030a713183084594659aefd77b76fe30178e23c8/Lib/urllib/request.py#L2746

        proxyOverride = proxyOverride.split(';')
        # now check if we match one of the registry values.
        for test in proxyOverride:
            if test == '<local>':
                if '.' not in rawHost:
                    return 1
            test = test.replace(".", r"\.")     # mask dots
            test = test.replace("*", r".*")     # change glob sequence
            test = test.replace("?", r".")      # change glob char
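
Going by that excerpt, the field looks like a semicolon-separated list of glob patterns plus a special `<local>` entry. A rough sketch of a matcher under that assumption follows. The helper name `matches_proxy_override` is made up, and unlike the urllib code this version escapes every regex metacharacter and requires the whole hostname to match.

```python
import re


def matches_proxy_override(host: str, proxy_override: str) -> bool:
    """Check one host against a ProxyOverride-style string (hypothetical helper)."""
    for entry in proxy_override.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        if entry == "<local>":
            # "<local>" covers plain names without a dot, such as "intranet".
            if "." not in host:
                return True
            continue
        # Escape everything, then turn the glob characters back into regex.
        pattern = re.escape(entry).replace(r"\*", ".*").replace(r"\?", ".")
        if re.fullmatch(pattern, host, re.IGNORECASE):
            return True
    return False


override = "<local>;*.example.com;192.168.*"  # made-up registry value
```

For example, `matches_proxy_override("www.example.com", override)` is true, while `matches_proxy_override("example.org", override)` is not. Note that this is still a per-host check. Mapping these globs onto mount patterns would need the mounts to support `*` anywhere in the hostname.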

  2. The "exceptions" field returned by _scproxy._get_proxy_settings()

On my system this returns...

>>> _get_proxy_settings()
{'exclude_simple': False, 'exceptions': ('*.local', '169.254/16',)}

There's also an example documented in the urllib source code here...

https://github.com/python/cpython/blob/030a713183084594659aefd77b76fe30178e23c8/Lib/urllib/request.py#L2556

{ 'exclude_simple': bool, 'exceptions': ['foo.bar', '*.bar.com', '127.0.0.1', '10.1', '10.0/16']}

So '10.0/16' and '169.254/16' here are not IP addresses, but IP ranges. Those are a bit awkward for us, since we don't currently support subnet matching on transport mounts.
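
As a sketch of what handling those ranges could involve, the abbreviated entries can be expanded into proper networks with the standard ipaddress module. The helper name `exception_to_network` is made up, and the rule for entries without a mask (8 bits per octet given) is an assumption based on how urllib's macOS bypass check appears to treat them.

```python
import ipaddress


def exception_to_network(value: str) -> ipaddress.IPv4Network:
    """Expand an abbreviated entry like '169.254/16' (hypothetical helper)."""
    base, _, prefix = value.partition("/")
    octets = base.split(".")
    # Assumption: without an explicit mask, each octet given counts for 8 bits.
    prefix_len = int(prefix) if prefix else 8 * len(octets)
    # Pad the partial dotted quad with zeros: "169.254" -> "169.254.0.0".
    padded = ".".join(octets + ["0"] * (4 - len(octets)))
    # strict=False tolerates host bits being set, e.g. "10.1.1.1/16".
    return ipaddress.IPv4Network(f"{padded}/{prefix_len}", strict=False)


network = exception_to_network("169.254/16")
inside = ipaddress.ip_address("169.254.10.20") in network
```

This only covers the parsing side. Matching a request against such a network still needs an IP address, which a hostname-based mount pattern doesn't have.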

@amchii

amchii commented Mar 29, 2021

Thanks!

I think httpx could drop support for system proxy settings and only use environment settings:

  1. It's hard to handle system proxy exclusions with httpx's mounts system.
  2. If you use the system proxy settings, you should also check the system proxy-bypass settings, otherwise the behaviour is not correct.

Besides, for the 'exceptions' field, '10.1' = '10.1/16' = '10.1.1.1/16'. If httpx needs to mount system proxy-bypass settings on Windows and macOS, they could be unified into a form like 10.1.*, and httpx would then either call socket.gethostbyname to check both the hostname and the IP of the request URL, or simply not call socket.gethostbyname, which means DNS lookups would not be supported.
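
A rough sketch of that idea, assuming the exclusions have already been unified into glob form. The helper name `should_bypass` and its `resolve` parameter are made up for illustration. Passing `resolve=None` corresponds to the "no DNS lookups" variant.

```python
import fnmatch
import socket
from typing import Callable, Iterable, Optional


def should_bypass(
    host: str,
    exceptions: Iterable[str],
    resolve: Optional[Callable[[str], str]] = socket.gethostbyname,
) -> bool:
    """Match the hostname, and optionally its resolved IP, against glob patterns."""
    patterns = [pattern.lower() for pattern in exceptions]
    candidates = [host.lower()]
    if resolve is not None:
        try:
            candidates.append(resolve(host))
        except OSError:
            pass  # unresolvable: fall back to matching the name alone
    return any(
        fnmatch.fnmatchcase(candidate, pattern)
        for candidate in candidates
        for pattern in patterns
    )
```

One caveat: a glob such as 10.1.* only lines up with a subnet when the prefix falls on an octet boundary (/8, /16, /24), so something like 10.0.0.0/20 can't be expressed this way.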

@tomchristie
Member Author

Sticking to environment-only settings would be one option, yes, though I'm not convinced that'd be the best from a user-experience point of view.

@tomchristie tomchristie added this to the v1.0 milestone Apr 29, 2021
@Jaharmi

Jaharmi commented May 22, 2021

Being able to pick up and use the system proxy settings — especially proxy auto config — would be a benefit in certain environments, even if it’s an option and not the default. I’m no longer in a situation like that, but Python PAC support on macOS could have made configuring some projects much easier. It might have been a reason to choose a library like HTTPX over another.

@tomchristie tomchristie modified the milestones: v1.0, v1.1 Jan 22, 2022
@stale

stale bot commented Feb 24, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Feb 24, 2022
@tomchristie
Member Author

Still valid at the moment. Could do with a review and possibly extra docs.

@stale stale bot removed the wontfix label Feb 24, 2022
@stale

stale bot commented Mar 27, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 27, 2022
@tomchristie
Member Author

Upping the durations on you, @stalebot. Shoo.

@stale stale bot removed the wontfix label Mar 28, 2022
@stale

stale bot commented Oct 15, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Oct 15, 2022
@piankma

piankma commented Mar 6, 2024

Nothing's changed on this topic, I presume?
