Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsupported proxy URL pattern: localhost:<port> #6032

Closed
tomers opened this issue Jan 6, 2022 · 2 comments
Closed

Unsupported proxy URL pattern: localhost:<port> #6032

tomers opened this issue Jan 6, 2022 · 2 comments

Comments

@tomers
Copy link

tomers commented Jan 6, 2022

I used to have a working code in which the proxy URL was provided without a scheme:

proxy_server = 'localhost:8100'
session = requests.Session()
session.proxies = dict(http=proxy_server, https=proxy_server)

Recently I started getting 'Please check proxy URL. It is malformed and could be missing the host.' which is caused by using proxy URL localhost:8100, which when parsed by urllib3 does not provide the host field.

I suspect this code to cause the regression: 9a8a826

Apparently having such proxy URL is undesired by requests lib, which enforces the host to be preset, with the explanation that proxy URL should conform to RFC3986, which contains authority section in the URL. Having authority section in the URL indeed guarantees that host is parsed:

>>> assert parse_url('localhost:8100').host == None
>>> assert parse_url('localhost.com:8100').host == 'localhost.com'
>>> assert parse_url('user@localhost:8100').host == 'localhost'

I've opened an issue in urllib3: urllib3/urllib3#2523, however, I am rather sure it will be closed, since my expected behavior is probably against the RFC.

Expected Result

I would expect the following proxy URL to be supported:

proxy_server = 'localhost:8100'
session = requests.Session()
session.proxies = dict(http=proxy_server, https=proxy_server)

Actual Result

Error is thrown:

Please check proxy URL. It is malformed and could be missing the host.

Reproduction Steps

import requests

assert requests.utils.prepend_scheme_if_needed('localhost:8100', 'http') == 'http://localhost:8100'

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "charset_normalizer": {
    "version": "2.0.8"
  },
  "cryptography": {
    "version": "36.0.1"
  },
  "idna": {
    "version": "2.10"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.8.10"
  },
  "platform": {
    "release": "5.4.0-92-generic",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "101010df",
    "version": "19.1.0"
  },
  "requests": {
    "version": "2.27.1"
  },
  "system_ssl": {
    "version": "1010106f"
  },
  "urllib3": {
    "version": "1.25.11"
  },
  "using_charset_normalizer": false,
  "using_pyopenssl": true
}
@nateprewitt
Copy link
Member

Hi @tomers, thanks for bringing this up. You're right that urllib3 doesn't parse this case as having a host, and you'll find all future versions of Python, starting with 3.9, don't either. We made a recent change in #5917 to move from the standard library's urlparse to urllib3's parse_url to ensure we're keeping consistent behavior across all python versions which was a notable pain point.

You'll find starting in python 3.9, this is the same with our old method as well:

>>> from urllib.parse import urlparse
>>> urlparse('localhost:8000')
ParseResult(scheme='localhost', netloc='', path='8000', params='', query='', fragment='')

Given this url format is unsupported in every reasonable implementation we can use, explicitly for security reasons, I'm not sure we're inclined to fix this. Attempting to write our own parser is going to create even more surface area for potential security issues, so I don't believe that's an option either.

For now, simply adding http:// to the start of your proxy will resolve the issue. Otherwise, you can downgrade to a version of Requests which supports this. I'll leave this open for a bit to gauge impact but I think that's probably going to be our stance going forward.

@nateprewitt
Copy link
Member

Going to resolve since we haven’t seen any other activity on this. Thanks again for the report @tomers!

@psf psf locked as resolved and limited conversation to collaborators Jan 12, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants