Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip broken pipe errors #1524

Merged
merged 32 commits into from Sep 11, 2020
Merged

Conversation

robermorales
Copy link
Contributor

In the case the server legitimally close a connection but we are still
sending data (e.g. sending a POST request), the old code prevents the
response to be retrieved.

RFC2616 refers different scenarios where a server can legitimately close the connection.

https://tools.ietf.org/html/rfc2616#section-8.1.2 (8.1.2.1)
https://tools.ietf.org/html/rfc2616#section-10.4.14

With this change, we avoid a broken pipe in the request write process,
to end the read process.

I tested curl behavior when presented to this exact scenario and it is also ignoring the broken pipes and continue parsing the legitimate response.

This behavior often get reached when using urllib3 through requests, so the related issues are from it:

Related issues:

psf/requests#4778
psf/requests#2422

@codecov-io
Copy link

codecov-io commented Jan 11, 2019

Codecov Report

Merging #1524 into master will decrease coverage by 0.45%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1524      +/-   ##
==========================================
- Coverage   99.81%   99.35%   -0.46%     
==========================================
  Files          24       24              
  Lines        2183     2187       +4     
==========================================
- Hits         2179     2173       -6     
- Misses          4       14      +10     
Impacted Files Coverage Δ
src/urllib3/connection.py 100.00% <ø> (ø)
src/urllib3/connectionpool.py 100.00% <100.00%> (ø)
src/urllib3/contrib/_appengine_environ.py 90.00% <0.00%> (-10.00%) ⬇️
src/urllib3/util/wait.py 92.18% <0.00%> (-7.82%) ⬇️
src/urllib3/util/ssl_.py 97.77% <0.00%> (-2.23%) ⬇️
src/urllib3/response.py 99.45% <0.00%> (-0.55%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f3472ec...2335967. Read the comment docs.

Copy link
Member

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this! Could we add a test that fails with the current master but passes with this branch?

src/urllib3/connectionpool.py Outdated Show resolved Hide resolved
src/urllib3/connectionpool.py Outdated Show resolved Hide resolved
src/urllib3/connectionpool.py Show resolved Hide resolved
@robermorales
Copy link
Contributor Author

I made the changes you suggested. I cannot add the test right now, but will try add it and commit to the branch soon.

Thanks,

Copy link
Member

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's a lint error on Travis, could you take a look at this?

I need to try running that testcase against an earlier version of urllib3 to verify it works correctly.

test/with_dummyserver/test_connectionpool.py Outdated Show resolved Hide resolved
Copy link
Member

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more comment then this looks good for a final review.

test/with_dummyserver/test_connectionpool.py Outdated Show resolved Hide resolved
@sethmlarson
Copy link
Member

So I ran this code with the old urllib3 without the exception handling and it passed fine, are we sure we're triggering the pipe error with that testcase?

@robermorales
Copy link
Contributor Author

So I ran this code with the old urllib3 without the exception handling and it passed fine, are we sure we're triggering the pipe error with that testcase?

@sethmlarson I tried to reproduce a test case that fails with the old implementation and passes with the new one. I coudn't mimic a broken pipe in the test application, but I found a test that passes only with the change. What do you think that could be a good way to mimic the behaviour that we are covering? I am open to implement a new test that better covers it.

@sethmlarson
Copy link
Member

I'd definitely like to trigger a pipe error as least, maybe we need a lower-level interface than what Tornado is giving us to call a socket.close() before reading the body? Try looking in test/with_dummyserver/test_socketlevel.py, maybe we move this testcase to that file instead?

@robermorales
Copy link
Contributor Author

@sethmlarson cool idea! I will return with some implementation. Thanks for all your help!

@sethmlarson
Copy link
Member

Of course! :) Thank you for all your effort tackling this issue!

@bbbco
Copy link

bbbco commented Mar 1, 2019

Any updates on this?

@sethmlarson
Copy link
Member

Nope, the original author appears to be MIA.

@robermorales
Copy link
Contributor Author

Sorry for the delay, I will return to the development of this soon. Thanks!

robermorales and others added 6 commits May 24, 2019 11:12
In the case the server legitimally close a connection but we are still
sending data (e.g. sending a POST request), the old code prevents the
response to be retrieved.

With this change, we avoid a broken pipe in the request write process,
to end the read process.
Added a comment with the rationale.
@EmptyRabbit
Copy link

when could this pr merge...... I blocked by this cause...

@sethmlarson
Copy link
Member

I'll be taking a look at this again after the holidays.

@sethmlarson
Copy link
Member

@hodbn An interesting article about the error you're running into: http://erickt.github.io/blog/2014/11/19/adventures-in-debugging-a-potential-osx-kernel-bug/

When the server closed the socket but the client still tries to read the
response, macOS can fail with EPROTOTYPE instead of simply EPIPE or
ESHUTDOWN, so BrokenPipeError isn't enough.

I also had to fix the test, otherwise the request would fail because it
didn't read the whole response (which is still a healthy thing to do).
Copy link
Member

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one nit then this LGTM

src/urllib3/connectionpool.py Outdated Show resolved Hide resolved
@sethmlarson sethmlarson added this to the v1.26 milestone Aug 27, 2020
@sethmlarson
Copy link
Member

Sorry this stalled out, that's my fault. Trying to get it landed for v1.26.

Pulled this down locally and verified the current set of changes works nicely on Linux.
For Windows I added 10053 which is "Connection Aborted Error" which can happen for multiple reasons it seems (antivirus, firewall, and your peer closing the connection) so hopefully that'll fix up the issue we were seeing on Windows.

@sethmlarson
Copy link
Member

sethmlarson commented Aug 28, 2020

I'm not happy to skip the test on Windows but I'm unsure if there's anything we can do? Windows seems to be erroring out consistently on sock.recv() once the socket is closed so we can't call conn.getresponse() and get anything meaningful.

@sethmlarson sethmlarson dismissed their stale review August 28, 2020 19:14

Have made some changes myself

hodbn
hodbn previously approved these changes Sep 10, 2020
pquentin
pquentin previously approved these changes Sep 11, 2020
Copy link
Member

@pquentin pquentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

We'll have to merge from master again to make sure we don't have two "notWindows" decorators

@sethmlarson sethmlarson dismissed stale reviews from pquentin and hodbn via 7d47b1c September 11, 2020 13:58
@sethmlarson
Copy link
Member

Thanks @pquentin and @hodbn for the reviews and @robermorales for putting together this change. This is a bigger fix than you think it is, many users are going to see real improvements to their network and API performance. 🎉

@sethmlarson sethmlarson merged commit e4c43f5 into urllib3:master Sep 11, 2020
@bessbd
Copy link

bessbd commented Oct 29, 2020

Hi @sethmlarson ,

When do you think this fix would be included in an upcoming release?
Is there something I could do to speed that process up? Should I open a PR that backports this change to any branch?

Thank you,
Donat

@sethmlarson
Copy link
Member

sethmlarson commented Oct 29, 2020

@bessbd This change won't be backported since it's not a bug / security fix. Not much to speed things up, we're wrapping up the last things that need to be done for 1.26.

The 1.26 release is special since it's the last minor before v2 so we're trying to wrap it up a lot more "completely" compared to a typical minor release.

nyh added a commit to nyh/scylla that referenced this pull request Nov 20, 2022
We had an xfailing test that reproduced a case where Alternator tried
to report an error when the request was too long, but the boto library
didn't see this error and threw a "Broken Pipe" error instead. It turns
out that this wasn't a Scylla bug but rather a bug in urllib3, which
overzealously reported a "Broken Pipe" instead of trying to read the
server's response. It turns out this issue was already fixed in
   urllib3/urllib3#1524

and now, on modern installations, the test that used to fail now passes
and reports "XPASS".

So in this patch we remove the "xfail" tag, and skip the test if
running an old version of urllib3.

Fixes scylladb#8195

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

x
nyh added a commit to nyh/scylla that referenced this pull request Nov 20, 2022
We had an xfailing test that reproduced a case where Alternator tried
to report an error when the request was too long, but the boto library
didn't see this error and threw a "Broken Pipe" error instead. It turns
out that this wasn't a Scylla bug but rather a bug in urllib3, which
overzealously reported a "Broken Pipe" instead of trying to read the
server's response. It turns out this issue was already fixed in
   urllib3/urllib3#1524

and now, on modern installations, the test that used to fail now passes
and reports "XPASS".

So in this patch we remove the "xfail" tag, and skip the test if
running an old version of urllib3.

Fixes scylladb#8195
nyh added a commit to nyh/scylla that referenced this pull request Nov 20, 2022
We had an xfailing test that reproduced a case where Alternator tried
to report an error when the request was too long, but the boto library
didn't see this error and threw a "Broken Pipe" error instead. It turns
out that this wasn't a Scylla bug but rather a bug in urllib3, which
overzealously reported a "Broken Pipe" instead of trying to read the
server's response. It turns out this issue was already fixed in
   urllib3/urllib3#1524

and now, on modern installations, the test that used to fail now passes
and reports "XPASS".

So in this patch we remove the "xfail" tag, and skip the test if
running an old version of urllib3.

Fixes scylladb#8195
denesb pushed a commit to scylladb/scylladb that referenced this pull request Nov 21, 2022
We had an xfailing test that reproduced a case where Alternator tried
to report an error when the request was too long, but the boto library
didn't see this error and threw a "Broken Pipe" error instead. It turns
out that this wasn't a Scylla bug but rather a bug in urllib3, which
overzealously reported a "Broken Pipe" instead of trying to read the
server's response. It turns out this issue was already fixed in
   urllib3/urllib3#1524

and now, on modern installations, the test that used to fail now passes
and reports "XPASS".

So in this patch we remove the "xfail" tag, and skip the test if
running an old version of urllib3.

Fixes #8195

Closes #12038
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants