Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve methodRewriting option #2136

Closed
1 task done
stjosh opened this issue Sep 8, 2022 · 12 comments
Closed
1 task done

Improve methodRewriting option #2136

stjosh opened this issue Sep 8, 2022 · 12 comments

Comments

@stjosh
Copy link
Contributor

stjosh commented Sep 8, 2022

What problem are you trying to solve?

The documentation correctly states:

If a 303 is sent by the server in response to any request type (POST, DELETE, etc.), Got will request the resource pointed to in the location header via GET.
This is in accordance with the specification.

However, Section 6.4.3 of the same RFC also states:

Note: For historical reasons, a user agent MAY change the request method from POST to GET for the subsequent request. If this behavior is undesired, the 307 (Temporary Redirect) status code can be used instead.

And indeed, I observe that even modern browsers like Chrome and/or Firefox do actually interpret a 302 in response to a POST Request as if it was a 303. I furthermore observe that there are still quite some servers around (including some I have to work with at my workplace) that do send 302 responses even though they should actually send 303 - be it because they need to handle legacy (i.e., pre-HTTP/1.1) clients which do not understand a 303 status code, or simply because they can count on the Browsers to do the right thing.

Describe the feature

For working with servers which do behave as described above (i.e., which send 302 responses when they actually should send a 303), it would be great to have an option which enables "legacy-handling" of 302 responses, i.e., which also changes the request method to GET and clears all request body contents when receiving a 302 response status code.

Currently, I always need to do a workaround like this:

got.post('http://foo.bar', {
    ...
    hooks: {
        beforeRedirect: [
            (options, response) => {
                if (response.statusCode === 302) {
                    options.method = 'GET'
                    options.json = undefined
                }
            }
        ]
    }
})

Checklist

  • I have read the documentation and made sure this feature doesn't already exist.
@szmarczak
Copy link
Collaborator

Got is not a browser, but a modern HTTP client so this is very unlikely.

@stjosh
Copy link
Contributor Author

stjosh commented Sep 8, 2022

I am very aware that got is not a browser 😄 However, as it needs to deal with all sort of HTTP servers - modern and old-fashioned, and the servers might not implement the full HTTP/1.1 standard for compatibility reasons, I still think this is a worthwhile extension of a "disabled-by-default" option.

E.g., there's also the allowGetBody-Option where the documentation states:

This option is only meant to interact with non-compliant servers when you have no other choice.

and

The RFC 7321 doesn't specify any particular behavior for the GET method having a payload, therefore it's considered an anti-pattern.

So, I think the same reasoning can be applied to add an option as I have described it (as indeed, I do have no other choice than to work with such non-compliant servers)

@szmarczak
Copy link
Collaborator

Given that you can work around this (the code you provided), is making this even more unlikely to be implemented in Got. allowGetBody had to be in Got since there was no way around it.

@stjosh
Copy link
Contributor Author

stjosh commented Sep 12, 2022

Many thanks for your responses @szmarczak - I do see and respect your point of view. As I think the basic point of our discussion should be RFC compliance, I did some further research, as the topic started to raise my curiosity.

TL;DR: Both approaches, i.e., changing HTTP methods after 301 and 302, or leaving them as-is, are compliant with RFC7231 "Hypertext Transfer Protocol (HTTP/1.1)". As the fully-fledged HTTP client Got aspires to be, I personally think that Got should give its users an option-driven way of handling this optionality the RFCs allow (i.e., without custom hooks) - especially since the default way of handling 301, 302 and 303 redirects seems to diverge from the vast majority of user-agents out there, including curl, the go standard library, python's urrlib, httpie, and all major browsers. Furthermore, it seems that other user-agents in general do handle method-switching the same on 301, 302 and 303 redirects - Got's decision to only switch methods on 303 but preserve the method on 301 and 302 seems to be at least uncommon.

I will gladly volunteer to implement this feature and submit a PR including doc and tests in case this has any chance of being accepted.

In case this feature request is not going to be accepted, I would suggest at least adding a short notice in the documentation including the hook-based workaround.

Long Story

RFC7231 Section 6.4 actually has a lengthy note about this topic (accentuation added by myself):

Note: In HTTP/1.0, the status codes 301 (Moved Permanently) and 302 (Found) were defined for the first type of redirect ([RFC1945], Section 9.3). Early user agents split on whether the method applied to the redirect target would be the same as the original request or would be rewritten as GET. Although HTTP originally defined the former semantics for 301 and 302 (to match its original implementation at CERN), and defined 303 (See Other) to match the latter semantics, prevailing practice gradually converged on the latter semantics for 301 and 302 as well. The first revision of HTTP/1.1 added 307 (Temporary Redirect) to indicate the former semantics without being impacted by divergent practice. Over 10 years later, most user agents still do method rewriting for 301 and 302; therefore, this specification makes that behavior conformant when the original request is POST.

Also, interestingly, it seems that curl (and, thereby, the immeasurably widely-used libcurl) - which I'd personally consider as kind of the de-facto standard when it comes to non-browser user-agents - made the choice to set the behavior of changing the HTTP method to GET on 301, 302 and even 303 redirects as its default behavior which needs to be overridden if desired - see the curl manpages and some more extensive reasoning on everything.curl.dev - here, the curl developers actually reason that it is the default case for servers to send 302 and expect the client to change methods and the exceptional case that they actually want the client to preserve the request method.

Further widely-used non-browser HTTP clients like the go standard library, python's urrlib implementation and httpie seem to have made the same design decision.

@szmarczak
Copy link
Collaborator

I'm not against an option, but we need more feedback. /cc @sindresorhus

@sindresorhus
Copy link
Owner

I'm ok with adding an option for it.

@stjosh
Copy link
Contributor Author

stjosh commented Sep 12, 2022

Ok, many thanks for your updates! I will submit a PR soonish 😃

@stjosh stjosh changed the title Add option to enable "legacy mode" for 302 redirects Add option to change HTTP Method to GET for 301 and 302 redirects Sep 12, 2022
@stjosh
Copy link
Contributor Author

stjosh commented Sep 12, 2022

@sindresorhus @szmarczak : Before I start the implementation, I'd appreciate your feedback on my basic idea to implement this (after some further thoughts):

I do propose to add an option named forcedGetRedirectCodes which takes an array of redirect codes (i.e., only 301, 302 and 303 would be accepted as array items).

The default value for this option would be [ 303 ] and this would not modify Got's current behavior of handling redirects. However, a user could, e.g., specify [ 301, 302, 303 ] to mimic curl's, default behavior, or also [] or null in the unlikely case the method shall never be changed.

I think this would be a flexible way of giving the user control when he needs it while preserving the basic decision to follow the intention of the HTTP/1.1 spec by default (which, indeed, would prefer to preserve the request method on 301 and 302).

Would you agree with this design? Any better proposals for the option's name than forcedGetRedirectCodes?

@szmarczak
Copy link
Collaborator

Sounds good 👍

@stjosh
Copy link
Contributor Author

stjosh commented Sep 13, 2022

Ok, bummer... While implementing, I found that there actually is already an option kind of doing what I need, i.e., the methodRewriting option. I am very sorry for having missed that so far, as I had looked for something near the "redirect" options and did not have methodRewriting on the radar (as also its documentation does never mention the word "redirect"). 🙈

So, this actually solves my initial problem. However, I'd propose two improvements:

  • First, I'd propose to update the docs to make it clear that this option relates to redirects - I'll open a separate PR for this.
  • Second, the methodRewriting option now also rewrites to GET for 307 and 308 responses, where method rewriting is explicitely prohibited by the specs. As I am sure that when a server sends 307 or 308, it understands the semantics (as these are new definitions and the whole argument of legacy client behavior does not count here), I'd suggest to alter the behavior of methodRewriting to only rewrite the method on 301, 302 or 303, which are the only allowed codes for this as specified. I.e., I'd actually classify this as a bug in the current methodRewriting implementation -> that would then probably qualify for a separate issue labelled as "bug".

@szmarczak
Copy link
Collaborator

PR welcome

@stjosh stjosh changed the title Add option to change HTTP Method to GET for 301 and 302 redirects Improve methodRewriting option Sep 14, 2022
@stjosh
Copy link
Contributor Author

stjosh commented Sep 14, 2022

See PR #2145

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants