
Replace use of grpc.Picker with grpc.Balancer #875

Closed
stevvooe opened this issue Jun 9, 2016 · 18 comments
@stevvooe
Contributor

stevvooe commented Jun 9, 2016

We knew this change was coming, just not quite when. We need to replace use of Picker with Balancer.

@aluzzardi
Member

Moving this to 1.13, WDYT? @stevvooe @LK4D4 @aaronlehmann

@aluzzardi aluzzardi added this to the 1.13.0 milestone Aug 5, 2016
@aaronlehmann
Collaborator

Definitely 1.13. Note that we are moving away from pickers anyway in recent PRs due to problems with NotifyReset.

@stevvooe stevvooe self-assigned this Aug 8, 2016
@aluzzardi
Member

@stevvooe @LK4D4 @aaronlehmann: Should we fix the issue with worker reconnection here? Should we log an issue?

@stevvooe
Contributor Author

stevvooe commented Aug 8, 2016

@aluzzardi Let's log another issue.

@aluzzardi
Member

@stevvooe @LK4D4 @aaronlehmann Can one of you formulate the problem in a separate issue? You have more context than I do regarding the reconnection problem.

@aaronlehmann
Collaborator

The reconnection issue is tracked in moby/moby#25017, and nine out of ten related PRs have already been merged.

@aluzzardi
Member

@aaronlehmann Didn't we toy with the idea of re-designing the peer selection part?

@LK4D4
Contributor

LK4D4 commented Aug 9, 2016

@aluzzardi we slightly changed it. It seems to fit our needs for 1.12.1. More sophisticated refactoring should be done as part of moving to grpc.Balancer.

@dperny
Collaborator

dperny commented Aug 9, 2016

> nine out of ten related PRs have already been marged

@aaronlehmann
Collaborator

Considering this a P1 because it's blocking our ability to upgrade the gRPC version, which is preventing us from fixing other bugs.

@aaronlehmann aaronlehmann assigned dperny and unassigned stevvooe Aug 15, 2016
@LK4D4
Contributor

LK4D4 commented Aug 19, 2016

@aaronlehmann @dperny I propose not using Balancer for now. Balancer is designed to manage multiple connections; using it to choose a single connection would bring only pain and confusion. It's also still an experimental API according to the grpc godoc. I propose simply replacing the CA picker with the same code the agent uses: Select a remote, then recreate the connection on any error. That's almost exactly what it does now.
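For illustration, the "Select + recreate on any error" pattern could be sketched as below. This is a self-contained toy, not swarmkit's actual API: `remotes`, `dialAndCall`, and `callWithRetry` are hypothetical names standing in for the agent's remote-selection and reconnection logic.

```go
package main

import (
	"errors"
	"fmt"
)

// remotes is a hypothetical set of manager addresses.
type remotes struct {
	addrs []string
	next  int
}

// Select returns the next candidate address, cycling on each call.
func (r *remotes) Select() (string, error) {
	if len(r.addrs) == 0 {
		return "", errors.New("no remotes")
	}
	addr := r.addrs[r.next%len(r.addrs)]
	r.next++
	return addr, nil
}

// dialAndCall stands in for "create a connection and issue the RPC".
// The healthy map simulates which remotes are reachable.
func dialAndCall(addr string, healthy map[string]bool) error {
	if !healthy[addr] {
		return fmt.Errorf("dial %s: connection refused", addr)
	}
	return nil
}

// callWithRetry implements the select-and-recreate loop: on any
// error, throw away the connection and select a new remote.
func callWithRetry(r *remotes, healthy map[string]bool, attempts int) (string, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		addr, err := r.Select()
		if err != nil {
			return "", err
		}
		if lastErr = dialAndCall(addr, healthy); lastErr == nil {
			return addr, nil
		}
	}
	return "", lastErr
}

func main() {
	r := &remotes{addrs: []string{"10.0.0.1:2377", "10.0.0.2:2377"}}
	healthy := map[string]bool{"10.0.0.2:2377": true}
	addr, err := callWithRetry(r, healthy, 3)
	fmt.Println(addr, err)
}
```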

@dperny
Collaborator

dperny commented Aug 19, 2016

I'm glad you said this, because I still wasn't sure what the right course of action was.

@stevvooe
Contributor Author

Here is the documentation for grpc.Balancer.Get:

    // Get gets the address of a server for the RPC corresponding to ctx.
    // i) If it returns a connected address, gRPC internals issues the RPC on the
    // connection to this address;
    // ii) If it returns an address on which the connection is under construction
    // (initiated by Notify(...)) but not connected, gRPC internals
    //  * fails RPC if the RPC is fail-fast and connection is in the TransientFailure or
    //  Shutdown state;
    //  or
    //  * issues RPC on the connection otherwise.
    // iii) If it returns an address on which the connection does not exist, gRPC
    // internals treats it as an error and will fail the corresponding RPC.
    //
    // Therefore, the following is the recommended rule when writing a custom Balancer.
    // If opts.BlockingWait is true, it should return a connected address or
    // block if there is no connected address. It should respect the timeout or
    // cancellation of ctx when blocking. If opts.BlockingWait is false (for fail-fast
    // RPCs), it should return an address it has notified via Notify(...) immediately
    // instead of blocking.
    //
    // The function returns put which is called once the rpc has completed or failed.
    // put can collect and report RPC stats to a remote load balancer.
    //
    // This function should only return the errors Balancer cannot recover by itself.
    // gRPC internals will fail the RPC if an error is returned.
The assertion that this can't be used to manage a multi-homed connection is incorrect. Get is called on every RPC. In fact, we can use it to control that connection with almost no complexity for the caller.

We really shouldn't be passing around pickers and remotes. We should be passing around client objects.

@LK4D4
Contributor

LK4D4 commented Aug 19, 2016

@stevvooe I'd really prefer not to replace one experimental grpc API with another. There's a risk we'll end up rewriting this yet again.
You may be right, but the code using the balancer in grpc is non-obvious to me. You can't return just any address from Get, only one for which grpc has already created a connection, and I don't understand how to initiate that connection other than through Notify.

@stevvooe
Contributor Author

@LK4D4 I think we just pass a naming.Resolver and use the RoundRobin implementation. I don't think we need to implement a balancer.
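A self-contained sketch of what a resolver in that style might look like. The `Operation`/`Update` shapes and the `Resolve`/`Next` methods approximate the experimental `grpc/naming` package; `staticResolver` is a hypothetical example that hands out a fixed manager list once, and in the real wiring it would be passed to the round-robin balancer at dial time rather than called directly.

```go
package main

import "fmt"

// Operation and Update approximate the experimental grpc/naming
// package: a Watcher emits batches of address additions/deletions.
type Operation int

const (
	Add Operation = iota
	Delete
)

type Update struct {
	Op   Operation
	Addr string
}

// staticResolver is a hypothetical resolver that hands gRPC a fixed
// list of manager addresses once and never updates it.
type staticResolver struct {
	addrs []string
}

type staticWatcher struct {
	updates chan []*Update
}

// Resolve creates a watcher for target, pre-loaded with one batch of
// Add updates for every known address.
func (r *staticResolver) Resolve(target string) (*staticWatcher, error) {
	w := &staticWatcher{updates: make(chan []*Update, 1)}
	var ups []*Update
	for _, a := range r.addrs {
		ups = append(ups, &Update{Op: Add, Addr: a})
	}
	w.updates <- ups
	return w, nil
}

// Next blocks until the next batch of address updates.
func (w *staticWatcher) Next() ([]*Update, error) {
	return <-w.updates, nil
}

func main() {
	r := &staticResolver{addrs: []string{"10.0.0.1:2377", "10.0.0.2:2377"}}
	w, _ := r.Resolve("swarm-managers")
	ups, _ := w.Next()
	for _, u := range ups {
		fmt.Println(u.Addr) // addresses the balancer would round-robin over
	}
}
```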

@LK4D4
Contributor

LK4D4 commented Aug 19, 2016

@stevvooe Still, both Resolver and Balancer are experimental. Also, RoundRobin chooses among already-connected connections, and in our case there is only one connected connection.

@dperny
Collaborator

dperny commented Aug 29, 2016

@LK4D4 your PR successfully purged all Picker usage from our codebase, right?

@LK4D4
Contributor

LK4D4 commented Aug 29, 2016

@dperny yes, there is no more picker.

@LK4D4 LK4D4 closed this as completed Aug 29, 2016