http_proxy is used for https:// URLs #11876

Open
cs278 opened this issue Mar 7, 2024 · 11 comments

@cs278
Contributor

cs278 commented Mar 7, 2024

After configuring the http_proxy environment variable to proxy plain HTTP requests, Composer starts failing because it also uses the proxy server for HTTPS requests. Whilst I appreciate there is no defined standard, I'm pretty sure Composer is deviating from convention here, but I could be wrong.

Output of composer diagnose:

$ http_proxy=http://10.0.17.44:80 php -f bin/composer diagnose
Checking composer.json: OK
Checking platform settings: OK
Checking git settings: OK git version 2.43.2
Checking http connectivity to packagist: OK
Checking https connectivity to packagist: FAIL
[Composer\Downloader\TransportException] curl error 56 while downloading https://repo.packagist.org/packages.json: Received HTTP code 400 from proxy after CONNECT
Checking HTTP proxy: FAIL
[Composer\Downloader\TransportException] curl error 56 while downloading https://repo.packagist.org/packages.json: Received HTTP code 400 from proxy after CONNECT
Checking github.com oauth access: FAIL
[Composer\Downloader\TransportException] curl error 56 while downloading https://api.github.com/: Received HTTP code 400 from proxy after CONNECT
Checking disk free space: OK
Checking Composer and its dependencies for vulnerabilities: WARNING
Failed performing audit: curl error 56 while downloading https://packagist.org/api/security-advisories/: Received HTTP code 400 from proxy after CONNECT
Composer version: 2.7.999-dev+source
PHP version: 7.2.5 - Package overridden via config.platform, actual: 7.2.34
PHP binary path: /usr/bin/php7.2
OpenSSL version: OpenSSL 3.0.2 15 Mar 2022
cURL version: 7.81.0 libz 1.2.11 ssl OpenSSL/3.0.2
zip: extension present, unzip present, 7-Zip present (7z)

curl/wget both seem to operate as I expect:

$ http_proxy=http://10.0.17.44:80 curl -I http://repo.packagist.org/packages.json
HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Date: Thu, 07 Mar 2024 18:42:39 GMT
Content-Type: application/json
Connection: keep-alive
Last-Modified: Thu, 07 Mar 2024 18:33:07 GMT
Vary: Accept-Encoding
ETag: W/"65ea0863-482"

$ http_proxy=http://10.0.17.44:80 curl -I https://repo.packagist.org/packages.json
HTTP/2 200 
server: nginx
date: Thu, 07 Mar 2024 18:42:28 GMT
content-type: application/json
last-modified: Thu, 07 Mar 2024 18:33:07 GMT
vary: Accept-Encoding
cache-control: private, max-age=0, no-cache
$ http_proxy=http://10.0.17.44:80 wget -qO/dev/null --server-response http://repo.packagist.org/packages.json
  HTTP/1.1 200 OK
  Server: nginx/1.18.0 (Ubuntu)
  Date: Thu, 07 Mar 2024 18:39:16 GMT
  Content-Type: application/json
  Transfer-Encoding: chunked
  Connection: keep-alive
  Last-Modified: Thu, 07 Mar 2024 18:33:07 GMT
  Vary: Accept-Encoding
  ETag: W/"65ea0863-482"

$ http_proxy=http://10.0.17.44:80 wget -qO/dev/null --server-response https://repo.packagist.org/packages.json
  HTTP/1.1 200 OK
  Server: nginx
  Date: Thu, 07 Mar 2024 18:39:23 GMT
  Content-Type: application/json
  Last-Modified: Thu, 07 Mar 2024 18:33:07 GMT
  Transfer-Encoding: chunked
  Connection: keep-alive
  Vary: Accept-Encoding
  Cache-Control: private, max-age=0, no-cache

I've tracked the problem down to a single line of code:

$httpsProxy = $httpProxy;

However it occurs to me that people might be relying on the behaviour so fixing it might not be as straightforward as removing that line.

References: https://about.gitlab.com/blog/2021/01/27/we-need-to-talk-no-proxy/

@Seldaek
Member

Seldaek commented Mar 8, 2024

I'm pretty sure we did this because some people only define http_proxy and most URLs are https these days. So indeed it is probably not a good idea to change this at this point.

Is there any way to set no_proxy to match https://* or something similar, perhaps?

@Seldaek Seldaek added the Support label Mar 8, 2024
@cs278
Contributor Author

cs278 commented Mar 8, 2024

FWIW npm appears to work the same as curl/wget.

I can't find any way to tell Composer to use the proxy only for HTTP. I've tried various things like this with no luck, so IMO there is definitely a bug.

$ http_proxy=http://10.0.17.44:80 https_proxy= no_proxy="https://*" php -f bin/composer diagnose 
Checking composer.json: OK
Checking platform settings: OK
Checking git settings: OK git version 2.43.2
Checking http connectivity to packagist: OK
Checking https connectivity to packagist: FAIL
[Composer\Downloader\TransportException] curl error 56 while downloading https://repo.packagist.org/packages.json: Received HTTP code 400 from proxy after CONNECT
Checking HTTP proxy: FAIL
[Composer\Downloader\TransportException] curl error 56 while downloading https://repo.packagist.org/packages.json: Received HTTP code 400 from proxy after CONNECT
Checking github.com oauth access: FAIL
[Composer\Downloader\TransportException] curl error 56 while downloading https://api.github.com/: Received HTTP code 400 from proxy after CONNECT
Checking disk free space: OK
Checking Composer and its dependencies for vulnerabilities: WARNING
Failed performing audit: curl error 56 while downloading https://packagist.org/api/security-advisories/: Received HTTP code 400 from proxy after CONNECT
Composer version: 2.7.999-dev+source
PHP version: 7.2.5 - Package overridden via config.platform, actual: 7.2.34
PHP binary path: /usr/bin/php7.2
OpenSSL version: OpenSSL 3.0.2 15 Mar 2022
cURL version: 7.81.0 libz 1.2.11 ssl OpenSSL/3.0.2
zip: extension present, unzip present, 7-Zip present (7z)

@Seldaek
Member

Seldaek commented Mar 11, 2024

Yeah, https_proxy being set to an empty value is just seen as not being set. Perhaps we could at least fix that, so you can explicitly set it empty to prevent it reusing http_proxy, as that shows you are aware https_proxy exists. Would you like to try and send a PR for this?
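
The distinction being proposed here, between an unset and an explicitly empty https_proxy, could look roughly like this. It is a minimal Ruby sketch rather than Composer's PHP code: `https_proxy_from` is a hypothetical helper, it reads only the lowercase variable names, and it takes a plain hash so it can be tried without touching the real environment.

```ruby
# Hypothetical helper, not Composer's code: resolve the proxy for https requests.
#   https_proxy unset      -> fall back to http_proxy (the existing behaviour)
#   https_proxy set, empty -> no proxy for https (the proposed opt-out)
#   https_proxy set, a URL -> use that URL
def https_proxy_from(env)
  return env["http_proxy"] unless env.key?("https_proxy")

  value = env["https_proxy"]
  value.empty? ? nil : value
end

p https_proxy_from({ "http_proxy" => "http://10.0.17.44:80" })
p https_proxy_from({ "http_proxy" => "http://10.0.17.44:80", "https_proxy" => "" })
```

Passing `ENV.to_h` instead of a literal hash would apply the same rules to the real environment.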

@Seldaek Seldaek added Bug and removed Support labels Mar 11, 2024
@Seldaek Seldaek added this to the 2.7 milestone Mar 11, 2024
@johnstevenson
Member

johnstevenson commented Mar 15, 2024

> http_proxy is used for https URLS

There is absolutely nothing wrong in doing this.

Before the https_proxy (or HTTPS_PROXY) environment variables popped up, everyone happily used http_proxy to make https requests using the HTTP CONNECT method.
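
For reference, the two request forms a client sends to a forward proxy can be sketched as follows. This is a minimal Ruby illustration, not Composer's or curl's code: `proxy_request_head` is a hypothetical helper, and real clients send more headers than the ones shown.

```ruby
require "uri"

# Hypothetical helper for illustration only: builds the first lines of the
# request a client sends to a forward proxy for a given target URL.
def proxy_request_head(url)
  uri = URI.parse(url)
  authority = "#{uri.host}:#{uri.port}"
  if uri.scheme == "https"
    # HTTPS: ask the proxy to open a tunnel; TLS then runs end to end inside it.
    "CONNECT #{authority} HTTP/1.1\r\nHost: #{authority}\r\n\r\n"
  else
    # Plain HTTP: send the absolute URL so the proxy can fetch it on our behalf.
    "GET #{uri} HTTP/1.1\r\nHost: #{uri.host}\r\n\r\n"
  end
end

puts proxy_request_head("https://repo.packagist.org/packages.json")
puts proxy_request_head("http://repo.packagist.org/packages.json")
```

A proxy that does not implement the CONNECT method can still serve the second form, which would be consistent with plain HTTP checks passing while https checks report an error "after CONNECT".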

> I've tracked the problem down to a single line of code: $httpsProxy = $httpProxy

This has nothing to do with the problem you are experiencing. In your example, using http_proxy=http://10.0.17.44:80, Composer ends up with the following data:

$this->fullProxy = [
    "http" => "http://10.0.17.44:80",
    "https" => "http://10.0.17.44:80",
]

It uses this data to select the correct proxy for the scheme of the request url, so in your case it will select the same proxy url for both http and https requests. Furthermore, we set the proxy url in curl using the CURLOPT_PROXY option, which prevents curl from using environment variables.
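
The selection step described above can be sketched as a lookup keyed on the request URL's scheme. This is a Ruby sketch under that assumption, not Composer's implementation: `FULL_PROXY` mirrors the array quoted above and `proxy_for` is a hypothetical name.

```ruby
require "uri"

# The per-scheme map from the comment above, rewritten as a Ruby hash.
FULL_PROXY = {
  "http"  => "http://10.0.17.44:80",
  "https" => "http://10.0.17.44:80",
}.freeze

# Hypothetical helper: pick the proxy whose key matches the request URL's scheme.
def proxy_for(url, proxies = FULL_PROXY)
  proxies[URI.parse(url).scheme]
end

puts proxy_for("http://repo.packagist.org/packages.json")
puts proxy_for("https://repo.packagist.org/packages.json")
```

With this map both calls return the same proxy URL, which is why an https request ends up being sent to the proxy configured through http_proxy.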

We designed it this way to handle the confusion over what each environment variable means and what it is expected to do. Essentially it doesn't actually matter what these variables are called: the important bit for Composer is the scheme of the proxy, due to limitations with streams and older curl versions.

> However it occurs to me that people might be relying on the behaviour

This is the correct behaviour. If you change it then every proxy user who has not set an https_proxy (or HTTPS_PROXY) variable will be unable to make https requests.

You obviously have an issue, but I don't think it is to do with Composer. I hope the above information helps you track it down.

I've tested this with PHP 8.3.4, which uses the latest curl version (8.6.0), and it still works as intended:

$ http_proxy=http://localhost:6180 composer diagnose
Checking platform settings: OK
Checking git settings: OK git version 2.41.0
Checking http connectivity to packagist: OK
Checking https connectivity to packagist: OK
Checking HTTP proxy: OK
Checking github.com rate limit: OK
Checking disk free space: OK
Checking pubkeys:
Tags Public Key Fingerprint: 57815BA2 7E54DC31 7ECC7CC5 573090D0  87719BA6 8F3BB723 4E5D42D0 84A14642
Dev Public Key Fingerprint: 4AC45767 E5EC2265 2F0C1167 CBBB8A2B  0C708369 153E328C AD90147D AFE50952
OK
Checking Composer version: OK
Checking Composer and its dependencies for vulnerabilities: OK
Composer version: 2.7.2
PHP version: 8.3.4
PHP binary path: C:\php-8.3.4\php.exe
OpenSSL version: OpenSSL 3.0.13 30 Jan 2024
cURL version: 8.6.0 libz 1.2.12 ssl OpenSSL/3.0.13
zip: extension present, unzip present, 7-Zip present (7z)

@cs278
Contributor Author

cs278 commented Mar 19, 2024

Thanks for the detailed response @johnstevenson.

Could you please explain how I can configure Composer to use a proxy for HTTP and not HTTPS? I have a proxy which purposefully does not permit the use of HTTP CONNECT.

> This is the correct behaviour.

So Composer works differently to curl and a lot of other tools/libraries? It would definitely be worthwhile documenting this.

@johnstevenson
Member

Thanks for explaining your proxy config. Is there a reason for not using HTTP CONNECT?

> It would definitely be worthwhile documenting this.

Here is the documentation from 2012:

If you are using composer from behind an HTTP proxy, you can use the standard
`http_proxy` or `HTTP_PROXY` env vars. Simply set it to the URL of your proxy.
Many operating systems already set this variable for you.

Using `http_proxy` (lowercased) or even defining both might be preferrable since
some tools like git or curl will only use the lower-cased `http_proxy` version.

And it hasn't changed much since: https://getcomposer.org/doc/03-cli.md#http-proxy-or-http-proxy

> Could you please explain how I can configure Composer to use a proxy for HTTP and not HTTPS?

By default, Composer uses https (unless config disable-tls is set) so you don't actually need to use a proxy.

@Seldaek The landscape seems to have changed a little since 2020, in terms of specific https_proxy usage. Back then, C# and dotnet preferred http_proxy, but from a quick trawl through the dotnet repos they nearly all require https_proxy for https connections. Ruby is still sticking to http_proxy, though.

It might now make sense to follow suit and exactly match curl, particularly as they now support CIDR ranges in no_proxy, which they didn't in 2020. Perhaps we could show a warning for a period of time that https_proxy will be required after a certain date?
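
As an illustration of what CIDR support in no_proxy involves, the check below tests whether a target IP falls inside any range listed in the variable. It is a minimal Ruby sketch using the standard `ipaddr` library, not curl's or Composer's matching code: `no_proxy_ip_match?` is a hypothetical helper, the value is assumed to be comma-separated, and hostname or domain-suffix entries are simply skipped.

```ruby
require "ipaddr"

# Hypothetical helper: true when `host_ip` falls inside any CIDR range, or
# equals any literal IP, listed in a comma-separated no_proxy value.
def no_proxy_ip_match?(host_ip, no_proxy)
  ip = IPAddr.new(host_ip)
  no_proxy.split(",").map(&:strip).reject(&:empty?).any? do |entry|
    begin
      IPAddr.new(entry).include?(ip)
    rescue IPAddr::Error
      false # not an IP or CIDR entry (for example a hostname), so skip it
    end
  end
end

p no_proxy_ip_match?("10.0.17.44", "localhost, 10.0.0.0/8")
p no_proxy_ip_match?("10.0.17.44", "localhost, 192.168.0.0/16")
```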

@Seldaek
Member

Seldaek commented Mar 27, 2024

Yes I would tend to agree, but it's a bit problematic as it will risk causing breakage for people not defining HTTPS_PROXY. I'm not sure how we can best handle this, maybe support setting an empty HTTPS_PROXY first - without overwriting it with the http_proxy value - so that we enable that use case (assuming curl and others handle that well). Then we could warn if https_proxy is unset and http_proxy is set, but keep the "copy" mechanism in place for now, and then eventually remove that?
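
The warning condition in that transition plan could be expressed roughly as follows. This is a Ruby sketch of the idea only: `proxy_deprecation_warning` and the message text are hypothetical, and only the lowercase variable names are considered.

```ruby
# Hypothetical helper, not Composer's code: returns a warning string when
# http_proxy is set but https_proxy is not defined at all, and nil otherwise.
# An https_proxy that is defined but empty counts as an explicit choice.
def proxy_deprecation_warning(env)
  return nil if env["http_proxy"].to_s.empty?
  return nil if env.key?("https_proxy")

  "http_proxy is currently reused for https requests; " \
    "set https_proxy explicitly (an empty value disables the proxy for https)"
end

p proxy_deprecation_warning({ "http_proxy" => "http://10.0.17.44:80" })
p proxy_deprecation_warning({ "http_proxy" => "http://10.0.17.44:80", "https_proxy" => "" })
```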

@cs278
Contributor Author

cs278 commented Mar 27, 2024

> Is there a reason for not using HTTP CONNECT?

Quite simply nginx doesn't support it.

> By default, Composer uses https (unless config disable-tls is set) so you don't actually need to use a proxy.

I know. My use case is a CI environment where we started setting http_proxy= to cache APT downloads and a few other things that use plain connections. Other tools work just fine; Composer is the one thing that doesn't work as expected.

> Ruby is still sticking to http_proxy, though.

Ruby uses the scheme to determine which proxy to use, although the docs are not very clear on the matter.

require 'uri'

['http://example.com/', 'https://example.com/'].each { |url|
    print url, " - ", URI.parse(url).find_proxy, "\n"
}
$ http_proxy=http://proxy:8123/ ruby test.rb                                                                                                                       
http://example.com/ - http://proxy:8123/
https://example.com/ - 
$ https_proxy=http://proxy:8123/ ruby test.rb 
http://example.com/ - 
https://example.com/ - http://proxy:8123/
$ ftp_proxy=http://proxy:8123/ ruby test.rb 
http://example.com/ - 
https://example.com/ - 

@johnstevenson
Member

> Ruby uses the scheme to determine which proxy to use

Ah, thanks for clarifying that.

> Quite simply nginx doesn't support it.

That's because nginx is a reverse proxy, not a forward proxy. The purpose of the http_proxy environment variables is to inform tools that a forward proxy is being used and to act accordingly.

@Seldaek
Member

Seldaek commented Apr 17, 2024

#11915 prepares the terrain to get this fixed in 2.8.

@Seldaek Seldaek modified the milestones: 2.7, 2.8 Apr 17, 2024
@cs278
Contributor Author

cs278 commented Apr 17, 2024

> #11915 prepares the terrain to get this fixed in 2.8.

👍 Brilliant, thanks both.
