Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some fetch request headers don't add properly #24903

Open
pshaughn opened this issue Nov 28, 2019 · 9 comments
Open

Some fetch request headers don't add properly #24903

pshaughn opened this issue Nov 28, 2019 · 9 comments
Labels
A-content/dom Interacting with the DOM from web content A-network

Comments

@pshaughn
Copy link
Member

A few WPT tests fail because trying to put headers in a request does the wrong thing in some subcases.

fetch/api/basic/forbidden-headers: Servo stops a script from injecting an Accept-Encoding, Access-Control-Request-Headers, or Access-Control-RequestMethod header by throwing an exception, where the expected behavior is to just silently drop the header.
fetch/api/basic/request-headers-case: Nonstandard request header names are getting case-folded. I think this might be a problem with using Hyperium's implementation of HeaderMap, which is focused on holding a very "normative" form of headers and might not be appropriate for a client that exposes Web APIs.
fetch/api/basic/request-referrer: It looks like trying to set a referrer header to "about:client" sets it to null instead, instead of leaving it as the actual refererrer URL.

@jdm jdm added A-content/dom Interacting with the DOM from web content A-network labels Nov 28, 2019
@pshaughn
Copy link
Member Author

pshaughn commented Dec 20, 2019

The test cases in fetch/api/basic/request-forbidden-headers.any.js are failing not because of anything to do with those three particular header names, but because the test is trying to set "" as the header values in these cases, which Servo currently disallows but should allow if it wants to pass this and some other WPT tests.

// There is some unresolved confusion over the definition of a name and a value.
// The fetch spec [1] defines a name as "a case-insensitive byte
// sequence that matches the field-name token production. The token
// productions are viewable in [2]." A field-name is defined as a
// token, which is defined in [3].
// ISSUE 1:
// It defines a value as "a byte sequence that matches the field-content token production."
// To note, there is a difference between field-content and
// field-value (which is made up of field-content and obs-fold). The
// current definition does not allow for obs-fold (which are white
// space and newlines) in values. So perhaps a value should be defined
// as "a byte sequence that matches the field-value token production."
// However, this would then allow values made up entirely of white space and newlines.
// RELATED ISSUE 2:
// According to a previously filed Errata ID: 4189 in [4], "the
// specified field-value rule does not allow single field-vchar
// surrounded by whitespace anywhere". They provided a fix for the
// field-content production, but ISSUE 1 has still not been resolved.
// The production definitions likely need to be re-written.
// [1] https://fetch.spec.whatwg.org/#concept-header-value
// [2] https://tools.ietf.org/html/rfc7230#section-3.2
// [3] https://tools.ietf.org/html/rfc7230#section-3.2.6
// [4] https://www.rfc-editor.org/errata_search.php?rfc=7230
//
// As of December 2019 WHATWG, isn't even using grammar productions for value;
// https://fetch.spec.whatg.org/#concept-header-value just says not to have
// newlines, nulls, or leading/trailing whitespace.

@pshaughn
Copy link
Member Author

Servo's implementation of https://fetch.spec.whatwg.org/#dom-request successfully sets the referrer to NetTraitsReferrer::Client, but nothing else in Servo acts on that value. https://w3c.github.io/webappsec-referrer-policy/#determine-requests-referrer step 3 is what's supposed to use it, but I can't find step 3 in Servo's implementation.

/// <https://w3c.github.io/webappsec-referrer-policy/#determine-requests-referrer>
/// Steps 4-6.
pub fn determine_request_referrer(
This starts at step 4, and its caller is
// Step 8.
{
let referrer_url = match mem::replace(&mut request.referrer, Referrer::NoReferrer) {
Referrer::NoReferrer => None,
Referrer::Client => {
// FIXME(#14507): We should never get this value here; it should
// already have been handled in the script thread.
request.headers.remove(header::REFERER);
None
},
Referrer::ReferrerUrl(url) => {
request.headers.remove(header::REFERER);
let current_url = request.current_url();
determine_request_referrer(
&mut request.headers,
request.referrer_policy.unwrap(),
url,
current_url,
)
},
};
if let Some(referrer_url) = referrer_url {
request.referrer = Referrer::ReferrerUrl(referrer_url);
}
}
which is written under the impression that the Client value should have been handled at an earlier point.

@pshaughn
Copy link
Member Author

pshaughn commented Dec 21, 2019

https://github.com/servo/servo/blob/master/tests/wpt/web-platform-tests/fetch/api/headers/header-values.html and https://fetch.spec.whatwg.org/#concept-header-value indicate that a header value is allowed to contain octets that aren't even valid UTF-8. Servo definitely does not expect this, and when I change the validation code to allow it, Headers::fill crashes trying to unwrap value.to_str(). Assuming the spec and test are as intended, it will be necessary to comb for every codepath that expects header values to be legal UTF-8. Other browsers do pass this test.

https://github.com/hyperium/http/blob/455f33e4d8a1e799c4081a08d912c516ce5b8680/src/header/value.rs#L12-L18 corroborates.

@Darkspirit
Copy link
Sponsor Contributor

Darkspirit commented Dec 21, 2019

forbidden-headers: Hyper seems to support empty header values, that should be fine.


fetch/api/basic/request-headers-case: Nonstandard request header names are getting case-folded.

WTP tests use HTTP/1.1. HTTP/2 headers are lowercase. Hyper uses lowercase headers for HTTP/1.1 by default, too, but Servo uses a config option to title case all HTTP/1.1 headers for compatibility reasons.

request-headers-case:

  • It seems the test wants a whitespace after a comma here:

    combined_value.push(b',');

    Adding another combined_value.push(b' '); fixes this part. (https://bugzilla.mozilla.org/show_bug.cgi?id=1346313)

  • HeaderName is not compared case-insensitively, but it must be.

    HTTP/2 standard: https://tools.ietf.org/html/rfc7540#page-53

    Just as in HTTP/1.x, header field names are strings of ASCII
    characters that are compared in a case-insensitive fashion. However,
    header field names MUST be converted to lowercase prior to their
    encoding in HTTP/2. A request or response containing uppercase
    header field names MUST be treated as malformed (Section 8.1.2.6).

assert_regexp_match(body, /THIS-is-A-test: 1, 2/)

If you add an i for case-insensitive comparison, the test passes: assert_regexp_match(body, /THIS-is-A-test: 1, 2/i).

@pshaughn
Copy link
Member Author

HTTP standards often states that things like different string casing are semantically identical. My interpretation of this semantic equivalence, in terms of meeting WPT compliance goals when all major browsers already pass a given test, is that those things should have no effect on network behavior when executing HTTP protocols, but should still be preserved in a raw form and exposed verbatim to Javascript code unless the appropriate WHATWG standard says they're exposed in a normalized form.

@pshaughn
Copy link
Member Author

on the NetTraitsReferrer::Client front, #14507 comes into play.

Referrer::Client => {
// FIXME(#14507): We should never get this value here; it should
// already have been handled in the script thread.
request.headers.remove(header::REFERER);
None
},

This in turn seems to depend on "settings objects", as also needed by e.g. #25079

@jdm
Copy link
Member

jdm commented Dec 21, 2019

So far we have avoided introducing an explicit settings object concept for fetches in favour of providing the data the fetch uses from them, like explicit origins.

@Darkspirit
Copy link
Sponsor Contributor

Darkspirit commented Dec 21, 2019

The compatibility concern was only about a few broken HTTP/1.1 servers:
whatwg/fetch#476 changed the fetch spec to stop forcibly lowercasing headers for now when sending HTTP/1.1 requests (even though HTTP/2 headers are anyway lowercase) and instead retain the first type of spelling for a given header name. There is a general desire to revert this back to an unified behavior at some point. Methods kept lowercasing headers:

valid_name = valid_name.to_lowercase();

Set() and Append() lowercase header names, but Delete() and Get() do not yet. Firefox lowercases them, too. Lowercasing could be done at one place, in validate_name().

Go seems to have the same sane behavior as Hyper and gets away with it as it correctly assumed that underlying header implementations of broken servers are fixed once HTTP/2 support is added: golang/go#5022
Go and hyper correctly send THIS-IS-A-TEST as This-Is-A-Test: https://play.golang.org/p/onCY3_MHGAe Some operators of broken legacy applications complained that they don't want to fix their compatibility issue and instead demanded that a Go proxy software should support case sensitive HTTP/1.1 headers, but it was successfully denied so far. They need to fix their applications for HTTP/2 and HTTP/3 anyway.

@pshaughn
Copy link
Member Author

hyperium/http#376 is at least one reason why we're still not passing header-values.html, though maybe not the only one.

bors-servo pushed a commit that referenced this issue Dec 24, 2019
Header values no longer have to be ASCII or UTF-8

<!-- Please describe your changes on the following line: -->
This passes some failed tests related to header validity when handling ByteStrings outside the printable ASCII range. A few failures remain because the HeaderValue class is stricter than WHATWG/WPT, disallowing various control-code bytes that the spec and tests expect to be allowed.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix some of the test cases described in #24903

<!-- Either: -->
- [X] There are tests for these changes OR

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->
bors-servo pushed a commit that referenced this issue Jan 1, 2020
Header values no longer have to be ASCII or UTF-8

<!-- Please describe your changes on the following line: -->
This passes some failed tests related to header validity when handling ByteStrings outside the printable ASCII range. A few failures remain because the HeaderValue class is stricter than WHATWG/WPT, disallowing various control-code bytes that the spec and tests expect to be allowed.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix some of the test cases described in #24903

<!-- Either: -->
- [X] There are tests for these changes OR

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->
bors-servo pushed a commit that referenced this issue Jan 3, 2020
Header values no longer have to be ASCII or UTF-8

<!-- Please describe your changes on the following line: -->
This passes some failed tests related to header validity when handling ByteStrings outside the printable ASCII range. A few failures remain because the HeaderValue class is stricter than WHATWG/WPT, disallowing various control-code bytes that the spec and tests expect to be allowed.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix some of the test cases described in #24903

<!-- Either: -->
- [X] There are tests for these changes OR

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->
bors-servo pushed a commit that referenced this issue Jan 3, 2020
Header values no longer have to be ASCII or UTF-8

<!-- Please describe your changes on the following line: -->
This passes some failed tests related to header validity when handling ByteStrings outside the printable ASCII range. A few failures remain because the HeaderValue class is stricter than WHATWG/WPT, disallowing various control-code bytes that the spec and tests expect to be allowed.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix some of the test cases described in #24903

<!-- Either: -->
- [X] There are tests for these changes OR

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-content/dom Interacting with the DOM from web content A-network
Projects
Development

No branches or pull requests

3 participants