Recently I started getting the error 'Please check proxy URL. It is malformed and could be missing the host.' It is caused by the proxy URL localhost:8100, which urllib3 parses without a host field.
I suspect this commit caused the regression: 9a8a826
Apparently the requests library rejects such a proxy URL: it requires the host to be present, on the grounds that a proxy URL should conform to RFC 3986, whose URL syntax includes an authority section. Having an authority section in the URL does indeed guarantee that the host is parsed:
I've opened an issue in urllib3: urllib3/urllib3#2523. However, I am fairly sure it will be closed, since the behavior I expect probably goes against the RFC.
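The original snippet for this point did not survive in the thread. As a rough sketch of the same idea, the standard library's `urlsplit` is used below as a stand-in (the issue itself concerns urllib3's parser, which is not called here): once the URL carries an authority section, introduced by `//`, the host and port are recovered.

```python
from urllib.parse import urlsplit

# Both forms contain an authority section ("//host:port"),
# so the parser can identify the host and the port.
with_scheme = urlsplit("http://localhost:8100")
scheme_relative = urlsplit("//localhost:8100")

for parts in (with_scheme, scheme_relative):
    print(repr(parts.scheme), parts.hostname, parts.port)
```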
Expected Result
I would expect the following proxy URL to be supported:
Hi @tomers, thanks for bringing this up. You're right that urllib3 doesn't parse this case as having a host, and you'll find that no version of Python from 3.9 onward does either. We made a recent change in #5917 to move from the standard library's urlparse to urllib3's parse_url so that behavior stays consistent across all Python versions, which had been a notable pain point.
You'll find that, starting in Python 3.9, our old method behaves the same way:
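The output that accompanied this remark is missing from the thread. A minimal way to check it yourself, assuming "our old method" refers to the standard library's `urlparse` as described above:

```python
from urllib.parse import urlparse

# A bare "host:port" string has no "//", so there is no authority section.
bare = urlparse("localhost:8100")

# On Python 3.9+ the text before the colon is read as the scheme;
# on every version the network location stays empty, so no host is found.
print(bare)
print("hostname:", bare.hostname)
```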
Given this url format is unsupported in every reasonable implementation we can use, explicitly for security reasons, I'm not sure we're inclined to fix this. Attempting to write our own parser is going to create even more surface area for potential security issues, so I don't believe that's an option either.
For now, simply adding http:// to the start of your proxy will resolve the issue. Otherwise, you can downgrade to a version of Requests which supports this. I'll leave this open for a bit to gauge impact but I think that's probably going to be our stance going forward.
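The suggested workaround might look like the sketch below. `ensure_proxy_scheme` is a hypothetical helper written for illustration, not part of Requests or urllib3, and defaulting to `http` is an assumption that matches the advice above.

```python
from urllib.parse import urlsplit


def ensure_proxy_scheme(proxy: str, default_scheme: str = "http") -> str:
    """Prefix a scheme-less proxy such as 'localhost:8100' with a scheme.

    Hypothetical helper: values that already contain '://' are returned as-is.
    """
    if "://" in proxy:
        return proxy
    return f"{default_scheme}://{proxy}"


proxy_url = ensure_proxy_scheme("localhost:8100")
print(proxy_url)
print(urlsplit(proxy_url).hostname, urlsplit(proxy_url).port)

# The normalized value can then be passed through the usual proxies mapping,
# e.g. requests.get(url, proxies={"http": proxy_url, "https": proxy_url}).
proxies = {"http": proxy_url, "https": proxy_url}
```

Normalizing the value once, where the proxy setting is read, keeps the rest of the code independent of how the URL parser treats scheme-less input.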
I used to have working code in which the proxy URL was provided without a scheme:
Actual Result
Error is thrown:
Reproduction Steps
System Information