Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use existing host header when possible #172

Conversation

smarusa
Copy link
Contributor

@smarusa smarusa commented Oct 4, 2021

  • Remove Host header only if redirecting to a new host.
  • Clarify logic to operate on full host instead of hostname to keep/remove Host and Authorization headers. Host header is hostname+port. previousHostName could have been a Host or a Hostname.

Our company uses a dns caching library and makes requests by ip with the original hostname in the Host header. When following a redirect that Host header was removed always and for https requests this caused an SSL SAN error because the certificate SAN did not match IP.

subjectAltName does not match 111.111.111.111
SSL: no alternative certificate subject name matches target host name '111.111.111.111'

This change is to remove the configured Host header only if the redirect does not use the same host so that the configured Host header will be used to follow redirects when possible.

Further clarification of the problem this PR addresses:

  1. Service uses DNS cache to submit request by IP.
https://111.111.111.111/old/path
- H "Host: example.com"
  1. Server responds with redirect
- H "location: https://111.111.111.111/new/path"
or
- H "location: /new/path"

Previously the follow redirects library would request:

https://111.111.111.111/new/path

This would cause an error with https because host would not match the subject alt name in the certificate.

Changes in this PR will cause the new request to use the existing host header if the host does not change (both cases of same host in redirect (match determined by Host header if exists) or relative redirect):

https://111.111.111.111/new/path
- H "Host: example.com"

Copy link
Collaborator

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smarusa!

Your observation on host/hostname is definitely correct.

However, what I am missing from your problem description is that—if I understand correctly—the problem is when a redirect is issued to a relative URL. That's what I thought when reading this, and also saw in your tests (which are great, BTW). Could you confirm that this is indeed your problem? Perhaps you can put an example request/response in the description so we are sure about this situation.

In that case, my preferred solution is to still unconditionally remove all hOsT headers, which will return the actual host. Then, we set that host on redirectUrl, such that redirectUrl is correct. And then the correct host will become added from there. Plus, in the comments we should make it clear that this is for the relative URL case, something like:

// If the redirect is relative, carry over the host of the last request

index.js Outdated Show resolved Hide resolved
index.js Show resolved Hide resolved
index.js Show resolved Hide resolved
index.js Outdated Show resolved Hide resolved
test/test.js Outdated Show resolved Hide resolved
test/test.js Show resolved Hide resolved
test/test.js Show resolved Hide resolved
@smarusa
Copy link
Contributor Author

smarusa commented Oct 4, 2021

In that case, my preferred solution is to still unconditionally remove all hOsT headers, which will return the actual host. Then, we set that host on redirectUrl, such that redirectUrl is correct. And then the correct host will become added from there. Plus, in the comments we should make it clear that this is for the relative URL case, something like:

I've updated the comments providing example of the two use cases. Your suggested preference will work for us. It's slightly less efficient as we wouldn't be using the original request ip with host header and instead use the host in the response location and that may cause a dns hit but the redirect use case is relatively rare anyway. This would indeed fix the relative url case where the follow redirect was using the ip of the original request but not the host header causing the SSL SAN error.

Updated with your review comments

Thanks for quick response!

@smarusa smarusa force-pushed the redirect-with-host-header-when-possible branch from 5a91f2b to c99df10 Compare October 4, 2021 19:21
- Clarify logic to operate on host instead of hostname to keep/remove Host and Authorization headers. Host header is hostname+port.
@smarusa smarusa force-pushed the redirect-with-host-header-when-possible branch from c99df10 to a888775 Compare October 4, 2021 19:35
@smarusa smarusa removed their assignment Oct 4, 2021
@RubenVerborgh
Copy link
Collaborator

we wouldn't be using the original request ip with host header and instead use the host in the response location and that may cause a dns hit

But wouldn't those headers be the same then?

@smarusa
Copy link
Contributor Author

smarusa commented Oct 4, 2021

I think there may have been confusion as my original code didn't fully do what I expected when the full url was passed back in the location. I misunderstood and thought that url.resolve was not replacing the new url with the location host and using the currentUrl host. My intention was to make the follow request behave the same as the original request with the ip in the url and the host in the header.

Initial request

https://111.111.111.111/old/path
- H "host: example.com"

Server responds with redirect

- H "location: https://example.com/new/path"
or
- H "location: /new/path"

I intended for follow redirects to request (relative or not as long as the host in location matched host in initial request header):

https://111.111.111.111/new/path
- H "host: example.com"

In any case, I updated the code to your suggestion to behave the same way as before unless the location response is a relative url. In that case, for the new url host, use the host header in the original request instead of the host from the original url (which could be an ip).

https://example.com/new/path

@smarusa smarusa force-pushed the redirect-with-host-header-when-possible branch 2 times, most recently from ca6688b to 2cc405e Compare October 5, 2021 00:40
@smarusa smarusa force-pushed the redirect-with-host-header-when-possible branch from fb8e6e6 to fb9ab45 Compare October 5, 2021 00:45
@smarusa
Copy link
Contributor Author

smarusa commented Oct 13, 2021

@RubenVerborgh I updated the PR with code changes from your suggestions. Can you review again?

@RubenVerborgh
Copy link
Collaborator

Thanks, will do asap!

@RubenVerborgh RubenVerborgh self-assigned this Oct 13, 2021
@RubenVerborgh RubenVerborgh merged commit 2ad9e82 into follow-redirects:main Oct 30, 2021
@RubenVerborgh
Copy link
Collaborator

Thank you, released as part of v.1.14.5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants