Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core, grpclb,xds: let leaf LB policies explicitly refresh name resolution when subchannel connection is broken #8048

Merged

Conversation

voidzcy
Copy link
Contributor

@voidzcy voidzcy commented Apr 2, 2021

Currently each subchannel implicitly refreshes the name resolution when its connection is broken. That is, this feature is built into subchannel's internal implementation. Although it eliminates the burden of having LB implementations refreshing the resolver when connections to backends are broken, this is gives LB policies no chance to disable or override this refresh (e.g., in some complex load balancing hierarchy like xDS, LB policies may embed a resolver inside for resolving backends so the refreshing resolution operation should be hooked to the resolver embedded in the LB policy instead of the one in Channel).

This is likely to be a breaking change for users implementing their own LB policies (performing load balancing directly on backends). In order to make this transition smoothly, we add a check to SubchannelImpl that checks if the LoadBalancer has explicitly called Helper.refreshNameResolution for broken subchannels created by it. If not, it logs a warning and do the refresh.

A temporary LoadBalancer.Helper API ignoreRefreshNameResolution() is added to avoid false-positive warnings for xDS that intentionally does not want a refresh. Once the migration is done, this should be deleted.

See details in #8088.

@voidzcy voidzcy changed the title core, grpclb: let leaf LB policies explicitly refresh name resolution when subchannel connection is broken core, grpclb,xds: let leaf LB policies explicitly refresh name resolution when subchannel connection is broken Apr 9, 2021
@voidzcy voidzcy requested a review from ejona86 April 9, 2021 22:18
Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any ideas how we're supposed to notice bugs with LB behavior? It would seem trivial for a LB policy to not be triggering refreshes and no test would fail.

…l's refresh for cases LoadBalancer intentionally does not want a refresh.
…when handling subchannel state changes. Log a warning and do a refresh if it is not.
… a name resolution refresh for the resolver in Channel, so use ignoreRefreshNameResolutionCheck() to avoid false-positive warnings.
@voidzcy voidzcy force-pushed the bugfix/move_subchannel_refreshing_ns_to_lb branch from 9256ec1 to 2ff23c6 Compare April 15, 2021 19:34
@voidzcy
Copy link
Contributor Author

voidzcy commented Apr 15, 2021

Do we have any ideas how we're supposed to notice bugs with LB behavior? It would seem trivial for a LB policy to not be triggering refreshes and no test would fail.

Fixed by adding a flag. The Channel will still do the automatic refresh if the LoadBalancer did not do so. PTAL.

Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to help check that we've updated our code appropriately, replace the warning with an exception/panic and run the tests to make sure they pass. After a few releases with the warning, I wouldn't be surprised if we want to do something similar for a grpc-java release.

api/src/main/java/io/grpc/LoadBalancer.java Outdated Show resolved Hide resolved
api/src/main/java/io/grpc/LoadBalancer.java Outdated Show resolved Hide resolved
@voidzcy voidzcy merged commit 9614738 into grpc:master Apr 16, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants