Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keepalive pings are sent every [Time + Timeout] period; not every [Time] period #2638

Closed
dfawley opened this issue Feb 12, 2019 · 6 comments
Closed
Assignees

Comments

@dfawley
Copy link
Member

dfawley commented Feb 12, 2019

No description provided.

@dfawley dfawley self-assigned this Feb 12, 2019
@dfawley dfawley assigned easwars and unassigned dfawley Apr 11, 2019
easwars added a commit to easwars/grpc-go that referenced this issue Apr 19, 2019
This commit makes the following changes:
* Keep track of the time of the last read in the transport.
* Use this in the keepalive implementation to decide when to send out
  keepalives.
* Address the issue of keepalives being sent every [Time+Timeout] period
  instead of every [Time] period, as mandated by proposal A8.
* Makes many of the transport tests to run in parallel (as most of them
  spend a lot of time just sleeping, waiting for things to happen).

Proposal A8 is here:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md

This commit addresses:
grpc#2638
@jhump
Copy link
Member

jhump commented May 11, 2019

I'm pretty sure this is the root cause for a bug filed against grpcurl: fullstorydev/grpcurl#99

easwars added a commit to easwars/grpc-go that referenced this issue Jun 11, 2019
This commit makes the following changes:
* Keep track of the time of the last read in the transport.
* Use this in the keepalive implementation to decide when to send out
  keepalives.
* Address the issue of keepalives being sent every [Time+Timeout] period
  instead of every [Time] period, as mandated by proposal A8.
* Makes many of the transport tests to run in parallel (as most of them
  spend a lot of time just sleeping, waiting for things to happen).

Proposal A8 is here:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md

This commit addresses:
grpc#2638
ajwerner added a commit to ajwerner/cockroach that referenced this issue Jul 25, 2019
This PR upgrades gRPC from 1.13.0 to 1.21.2. The primary motivation for this
upgrade is to eliminate the disconnections caused by
grpc/grpc-go#1882. These failures manifest themselves
as the following set of errors:

```
ajwerner-test-0001> I190722 22:15:01.203008 12054 vendor/github.com/cockroachdb/circuitbreaker/circuitbreaker.go:322  [n1] circuitbreaker: rpc [::]:26257 [n2] tripped: failed to check for ready connection to n2 at ajwerner-test-0002:26257: connection not ready: TRANSIENT_FAILURE
```
Which then lead to tripped breakers and general badness. I suspect that there
are several other good bug fixes in here, including some purported leaks and
correctness fixes on shutdown.

I have verified that with this upgrade I no longer see connections break in
overload scenarios which reliably reproduced the situation in the above log.

This commit removes one condition from grpcutil.IsClosedConnection which should
be subsumed by the status check above. The `transport` subpackage has not been
around for many releases.

This does not upgrade to the current release 1.22.0 because the maintainer
mentions that it contains a bug
(grpc/grpc-go#2663 (comment)).

This change also unfortunately updates the keepalive behavior to be more spec
compliant (grpc/grpc-go#2642). This change mandates
a minimum ping time of 10s to the client. Given grpc/grpc-go#2638
this means that the rpc test for keepalives now takes over 20s.

I would be okay skipping it but leave that discussion for review.

Also updated the acceptance test to look out for an HTTP/2.0 header because
grpc now does not send RPCs until after the HTTP handshake has completed
(see grpc/grpc-go#2406).

Release note (bug fix): Upgrade grpc library to fix connection state management
bug.
easwars added a commit to easwars/grpc-go that referenced this issue Aug 20, 2019
This commit makes the following changes:
* Keep track of the time of the last read in the transport.
* Use this in the keepalive implementation to decide when to send out
  keepalives.
* Address the issue of keepalives being sent every [Time+Timeout] period
  instead of every [Time] period, as mandated by proposal A8.
* Makes many of the transport tests to run in parallel (as most of them
  spend a lot of time just sleeping, waiting for things to happen).

Proposal A8 is here:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md

This commit addresses:
grpc#2638
easwars added a commit to easwars/grpc-go that referenced this issue Aug 20, 2019
This commit makes the following changes:
* Keep track of the time of the last read in the transport.
* Use this in the keepalive implementation to decide when to send out
  keepalives.
* Address the issue of keepalives being sent every [Time+Timeout] period
  instead of every [Time] period, as mandated by proposal A8.
* Makes many of the transport tests to run in parallel (as most of them
  spend a lot of time just sleeping, waiting for things to happen).

Proposal A8 is here:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md

This commit addresses:
grpc#2638
easwars added a commit to easwars/grpc-go that referenced this issue Aug 30, 2019
This commit makes the following changes:
* Keep track of the time of the last read in the transport.
* Use this in the keepalive implementation to decide when to send out
  keepalives.
* Address the issue of keepalives being sent every [Time+Timeout] period
  instead of every [Time] period, as mandated by proposal A8.
* Makes many of the transport tests to run in parallel (as most of them
  spend a lot of time just sleeping, waiting for things to happen).

Proposal A8 is here:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md

This commit addresses:
grpc#2638
@stale
Copy link

stale bot commented Sep 6, 2019

This issue is labeled as requiring an update from the reporter, and no update has been received after 7 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@stale stale bot added the stale label Sep 6, 2019
@jhump
Copy link
Member

jhump commented Sep 6, 2019

@dfawley, I take it you are on top of the misattribution of "stale" to all of these and I should be able to just ignore the GH notifications I am getting?

@stale stale bot removed the stale label Sep 6, 2019
@dfawley
Copy link
Member Author

dfawley commented Sep 6, 2019

@jhump - Correct. I am still interested in discussion on #2909, however. :)

@easwars
Copy link
Contributor

easwars commented Oct 24, 2019

#3102 fixes the issue on the client side. A similar fix on the server side is required.

easwars added a commit to easwars/grpc-go that referenced this issue Nov 12, 2019
This PR contains the server side changes corresponding to the client
side changes made in grpc#3102.

Apart from the fix for the issue mentioned in
grpc#2638, this PR also makes some
minor code cleanup and fixes the channelz test for keepalives count.
easwars added a commit that referenced this issue Nov 19, 2019
This PR contains the server side changes corresponding to the client
side changes made in #3102.

Apart from the fix for the issue mentioned in
#2638, this PR also makes some
minor code cleanup and fixes the channelz test for keepalives count.
RealBar pushed a commit to RealBar/grpc-go that referenced this issue Nov 21, 2019
This PR contains the server side changes corresponding to the client
side changes made in grpc#3102.

Apart from the fix for the issue mentioned in
grpc#2638, this PR also makes some
minor code cleanup and fixes the channelz test for keepalives count.
@easwars
Copy link
Contributor

easwars commented Nov 21, 2019

This is fixed now, on both client and server side.

@easwars easwars closed this as completed Nov 21, 2019
@lock lock bot locked as resolved and limited conversation to collaborators May 20, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
3 participants