Alternator's response to oversized requests confuses the client drivers #8195

nyh · 2021-03-02T07:21:05Z

Alternator places a limit (using Seastar's HTTP server feature of content_length_limit) on the request size. We have a test, test_manual_requests.py::test_too_large_request, of what happens when the request's length exceeds the limit. This test still passes because it sees an error in the request - as expected. However, the type of error changed - probably in recent changes to the HTTP server - and we didn't notice. The error that the Cassandra driver shows today when sent an over-sized request (with a Content-Length header saying its over-the-limit size) is:

E           botocore.exceptions.ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "http://127.1.153.28:8000/".

Wireshark shows that the server read the Content-Length header, did not read the entire content, and instead sent a response:

149	6.356010328	127.1.124.69	127.0.0.1	HTTP	230	HTTP/1.1 413 Payload Too Large

But evidently the client did not read this response before noticing that the connection was closed and writes it already did got rejected.

It would be better if "common" clients like the Cassandra drivers or even Python's request.post (which currently reports a "broken pipe" in this scenario) would be able to report the 413 error instead of some generic I/O error. Remember that drivers are likely to retry on such generic I/O error, while a permanent 413 error could tell the client not to both retry.

This issue can be reproduced by the tests:

test_manual_requests.py::test_too_large_request_chunked
test_manual_requests.py::test_too_large_request_content_length

Perhaps when sending a response while the client is still sending the content, we should continue reading the content - although AFAIR the RFC is clear that we don't need to. Or, maybe it's enough to not read the content but also not close the connection immediately, to allow the client to hang on the write and read the response before noticing a broken pipe. However, I don't know if this actually works.

Because the content_length_limit is a Seastar feature, we will probably need to open a corresponding issue for Seastar.

The text was updated successfully, but these errors were encountered:

nyh · 2021-03-02T12:20:59Z

In #8196 @avikivity suggested that we should drop the content_length_limit feature from Seastar (because it doesn't correctly handle the cases where there is no Content-Length) and should implement this feature in Scylla. Such a decision will, of course, also be relevant to this issue.

nyh · 2021-03-02T14:41:52Z

I should note that this should probably be considered a bug in the Python HTTP client, not in the server. Other people have noticed this bug in the Python "requests" client too - see psf/requests#2422 (comment) and also issues 5296, 4062, 2422, 1574, 4445, 2490 in that project. The HTTP standard does allow a server to return a response and close the socket even before reading the entire request. In this case, when the client can't write any more ("broken pipe"), it still needs to read from the socket to get a response.

Nevertheless, DynamoDB doesn't have this problem. I don't know if it reads the entire request to completion - which is easy to do although a bit silly (do we do this even if the request is 100 MB?), or somehow has another trick to avoid the broken pipe on the client.

The HTTP RFC, while it allows sending a response before the request is fully read, mentions that even if the client is correctly written (which, as is now becoming evident, isn't the case), it is difficult to implement correctly in the HTTP server because of the "reset problem" explained in https://tools.ietf.org/html/rfc7230#section-6.6. Seastar's HTTP server probably doesn't handle this properly (it just closes the connection), which is another reason not to send a response before consuming the entire request. A quote from that link:

If a server performs an immediate close of a TCP connection, there is
a significant risk that the client will not be able to read the last
HTTP response. If the server receives additional data from the
client on a fully closed connection, such as another request that was
sent by the client before receiving the server's response, the
server's TCP stack will send a reset packet to the client;
unfortunately, the reset packet might erase the client's
unacknowledged input buffers before they can be read and interpreted
by the client's HTTP parser.

To avoid the TCP reset problem, servers typically close a connection
in stages. First, the server performs a half-close by closing only
the write side of the read/write connection. The server then
continues to read from the connection until it receives a
corresponding close by the client, or until the server is reasonably
certain that its own TCP stack has received the client's
acknowledgement of the packet(s) containing the server's last
response. Finally, the server fully closes the connection.

Like DynamoDB, Alternator rejects requests larger than some fixed maximum size (16MB). We had a test for this feature - test_too_large_request, but it was too blunt, and missed two issues: Refs #8195 Refs #8196 So this patch adds two better tests that reproduce these two issues: First, test_too_large_request_chunked verifies that an oversized request is detected even if the body is sent with chunked encoding. Second, both tests - test_too_large_request_chunked and test_too_large_request_content_length - verify that the rather limited (and arguably buggy) Python HTTP client is able to read the 413 status code - and doesn't report some generic I/O error. Both tests pass on DynamoDB, but fail on Alternator because of these two open issues. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210302154555.1488812-1-nyh@scylladb.com>

nyh · 2022-11-20T10:57:53Z

Curiously, this bug seems to have been solved on its own, and test/alternator/run --runxfail -v test_manual_requests.py::test_too_large_request_content_length now XPASSes on my Fedora 36 setup.

It may have been solved not by the AWS library but one of the the Python libraries it uses. I see the urllib3 library recently fixed this issue (see urllib3/urllib3#1524), and it reached urllib3 version 1.26.0 (dated 2020-11-10), I confirmed that botocore does use urllib3, but did not confirm that downgrading urllib3 reproduces the bug.

There's still an interesting discussion whether closing the connection as Scylla does is the right thing or we should read the entire request before closing the connection, but I think it's an academic discussion and there's no need to do it if the test passes. We should remove the "xfail" marker (but only if the library is new enough?) and close this issue.

We had an xfailing test that reproduced a case where Alternator tried to report an error when the request was too long, but the boto library didn't see this error and threw a "Broken Pipe" error instead. It turns out that this wasn't a Scylla bug but rather a bug in urllib3, which overzealously reported a "Broken Pipe" instead of trying to read the server's response. It turns out this issue was already fixed in urllib3/urllib3#1524 and now, on modern installations, the test that used to fail now passes and reports "XPASS". So in this patch we remove the "xfail" tag, and skip the test if running an old version of urllib3. Fixes scylladb#8195 Signed-off-by: Nadav Har'El <nyh@scylladb.com> x

We had an xfailing test that reproduced a case where Alternator tried to report an error when the request was too long, but the boto library didn't see this error and threw a "Broken Pipe" error instead. It turns out that this wasn't a Scylla bug but rather a bug in urllib3, which overzealously reported a "Broken Pipe" instead of trying to read the server's response. It turns out this issue was already fixed in urllib3/urllib3#1524 and now, on modern installations, the test that used to fail now passes and reports "XPASS". So in this patch we remove the "xfail" tag, and skip the test if running an old version of urllib3. Fixes scylladb#8195

nyh · 2022-12-04T21:56:39Z

This is a test patch to enable a test that was previously disabled. It shouldn't be backported.

We had an xfailing test that reproduced a case where Alternator tried to report an error when the request was too long, but the boto library didn't see this error and threw a "Broken Pipe" error instead. It turns out that this wasn't a Scylla bug but rather a bug in urllib3, which overzealously reported a "Broken Pipe" instead of trying to read the server's response. It turns out this issue was already fixed in urllib3/urllib3#1524 and now, on modern installations, the test that used to fail now passes and reports "XPASS". So in this patch we remove the "xfail" tag, and skip the test if running an old version of urllib3. Fixes scylladb#8195 Closes scylladb#12038

nyh added the area/alternator Alternator related Issues label Mar 2, 2021

nyh mentioned this issue May 23, 2021

Alternator's health-check request doesn't work properly with HTTPS #8691

Closed

nyh mentioned this issue Nov 20, 2022

test/alternator: un-xfail a test which passes on modern Python #12038

Closed

scylladb-promoter closed this as completed in 757d2a4 Nov 21, 2022

scylladb-promoter added the Backport candidate label Nov 21, 2022

nyh mentioned this issue Dec 1, 2022

Alternator tests fails when the connection is closed before the reply is fully read #12143

Closed

nyh removed the Backport candidate label Dec 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Alternator's response to oversized requests confuses the client drivers #8195

Alternator's response to oversized requests confuses the client drivers #8195

nyh commented Mar 2, 2021 •

edited

nyh commented Mar 2, 2021

nyh commented Mar 2, 2021 •

edited

nyh commented Nov 20, 2022 •

edited

nyh commented Dec 4, 2022

Alternator's response to oversized requests confuses the client drivers #8195

Alternator's response to oversized requests confuses the client drivers #8195

Comments

nyh commented Mar 2, 2021 • edited

nyh commented Mar 2, 2021

nyh commented Mar 2, 2021 • edited

nyh commented Nov 20, 2022 • edited

nyh commented Dec 4, 2022

nyh commented Mar 2, 2021 •

edited

nyh commented Mar 2, 2021 •

edited

nyh commented Nov 20, 2022 •

edited