transport: Add an Unwrap method to ConnectionError #5148

Merged on Feb 14, 2022 (1 commit)

@thallgren thallgren commented Jan 19, 2022

Adds an Unwrap method to enable error checking with errors.Is.

RELEASE NOTES: N/A

Adds an `Unwrap` method to enable error checking with `errors.Is`.

Signed-off-by: Thomas Hallgren <thomas@tada.se>

menghanl commented Jan 19, 2022

Can you explain how you expect this to be used?
If I didn't miss anything, transport.ConnectionError is never returned to the users directly.
And the status error (returned to the users) doesn't support Unwrap it seems?


thallgren commented Jan 22, 2022

The transport.ConnectionError is indeed returned by grpc.DialContext() now. I don't think it used to be though. We intended to upgrade to a more recent version and that caused our tests to fail. I managed to create a workaround for it but the real problem is the missing Unwrap, hence this PR.

Here's a code snippet showing the workaround:

		conn, err := grpc.DialContext(ctx, "unix:"+socketName, append([]grpc.DialOption{
			grpc.WithTransportCredentials(insecure.NewCredentials()),
			grpc.WithNoProxy(),
			grpc.WithBlock(),
			grpc.FailOnNonTempDialError(true),
		}, opts...)...)
		if err == nil {
			return conn, nil
		}

		// The google.golang.org/grpc/internal/transport.ConnectionError does not have an
		// Unwrap method. It does have a Origin method though.
		// See: https://github.com/grpc/grpc-go/pull/5148
		if oe, ok := err.(interface{ Origin() error }); ok {
			err = oe.Origin()
		}
		if firstTry && errors.Is(err, unix.ECONNREFUSED) {
		...

@thallgren

Additional info: the test that fails for us is an "orphaned socket" test. The client gets the transport.ConnectionError after the server has done this:

		listener, err := net.Listen("unix", sockname)
		if !assert.NoError(t, err) {
			return
		}
		listener.(*net.UnixListener).SetUnlinkOnClose(false)
		listener.Close()

@@ -2399,3 +2400,10 @@ func (s) TestClientDecodeHeaderStatusErr(t *testing.T) {
})
}
}

func TestConnectionError_Unwrap(t *testing.T) {
Contributor

This doesn't test the actual usage the user (you) are expecting. Is it possible to add a test in /test at the grpc.Dial level for this, to make sure we don't break it in the future?

Contributor Author

@dfawley Sorry to see that this PR was closed. I didn't keep a close eye on it after it was approved. The test is for the actual problem. It probably affects several different use-cases.

Contributor

The test is performed in an internal package, and does not go through any public APIs. So the problem is, if we stop returning transport.ConnectionError and do something different instead, whatever externally-visible use case(s) you have will be broken, but none of our tests will fail. That makes this test much less useful than something in grpc/test. Let me know if you need any help figuring out how to add a test in that package, as it can be tricky.

Contributor Author

@dfawley thanks for offering to help figure out how to make a more encompassing test. Unfortunately, I'm in a situation where I don't have more time to spend. I created a workaround in our code when we got hit by the regression. Then, as a courtesy to you (since I really appreciate this package), I provided a PR with a fix, and a unit test to verify that the fix does what it's supposed to do. I also explained exactly how to reproduce the use-case to get the error from DialContext. I'm afraid that's as far as I'm able to go with this, at least for the time being.

Thanks again for an excellent package.

Contributor

> I created a workaround in our code when we got hit by the regression.

AIUI you were relying on undocumented / unintentional behavior. This whole flow isn't something we recommend to users -- WithBlock / WithReturnConnectionError are not things we recommend using, and are not provided in most/any other languages' implementations of gRPC. As such, it is not a priority for us to maintain it. If we take this PR without an appropriate test, there's a reasonable chance it will break again in the future, and fixing it will be an extremely low priority for us, given the above. With a test, there's a much better chance we would notice it and prevent the breakage in the first place.

If you're okay with this risk and the lack of real support of this feature, we can merge this as-is. If you would like to discuss other ways of doing what you're attempting that are better supported, then we are happy to help with that as well.

Contributor Author

Thanks for offering to help but we have good reasons for the approach chosen and no desire to change it at this point. I'd like you to merge this as-is.

@github-actions

This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions bot added the stale label on Jan 31, 2022
@github-actions bot closed this on Feb 7, 2022
@dfawley reopened this on Feb 11, 2022
@dfawley changed the title from "Add an Unwrap method to transport.ConnectionError" to "transport: Add an Unwrap method to ConnectionError" on Feb 14, 2022
@dfawley merged commit 46009ac into grpc:master on Feb 14, 2022
@github-actions bot locked as resolved and limited conversation to collaborators on Aug 14, 2022